Holy moly this will be some next level spy shit. Guaranteed.
In “Looking to Listen at the Cocktail Party”, we present a deep learning audio-visual model for isolating a single speech signal from a mixture of sounds such as other voices and background noise. In this work, we are able to computationally produce videos in which speech of specific people is enhanced while all other sounds are suppressed.