Research Blog: Looking to Listen: Audio-Visual Speech Separation

From the Google Research Blog

Holy moly, this will be some next-level spy shit. Guaranteed.

In “Looking to Listen at the Cocktail Party”, we present a deep learning audio-visual model for isolating a single speech signal from a mixture of sounds such as other voices and background noise. In this work, we are able to computationally produce videos in which speech of specific people is enhanced while all other sounds are suppressed.
