April 14, 2018 by adragt

Research Blog: Looking to Listen: Audio-Visual Speech Separation

Holy moly, this will be some next-level spy shit. Guaranteed.

In “Looking to Listen at the Cocktail Party”, we present a deep learning audio-visual model for isolating a single speech signal from a mixture of sounds such as other voices and background noise. In this work, we are able to computationally produce videos in which speech of specific people is enhanced while all other sounds are suppressed.

Andy Dragt
