DocumentCode :
3541284
Title :
Reverberant speech separation based on audio-visual dictionary learning and binaural cues
Author :
Liu, Qingju ; Wang, Wenwu ; Jackson, Philip ; Barnard, Mark
Author_Institution :
Centre for Vision, Univ. of Surrey, Guildford, UK
fYear :
2012
fDate :
5-8 Aug. 2012
Firstpage :
664
Lastpage :
667
Abstract :
Probabilistic models of binaural cues, such as the interaural phase difference (IPD) and the interaural level difference (ILD), can be used to obtain the audio mask in the time-frequency (TF) domain for source separation of binaural mixtures. These models are, however, often degraded by acoustic noise. In contrast, the video stream contains relevant information about the synchronous audio stream that is not affected by acoustic noise. In this paper, we present a novel method for modeling the audio-visual (AV) coherence based on dictionary learning. A visual mask is constructed from the video signal based on the learnt AV dictionary, and combined with the audio mask to obtain a noise-robust audio-visual mask, which is then applied to the binaural signal for source separation. We tested our algorithm on the XM2VTS database, and observed a considerable performance improvement for noise-corrupted signals.
Keywords :
audio streaming; learning (artificial intelligence); probability; source separation; speech processing; video signal processing; audio mask; audio-visual coherence; audio-visual dictionary learning; binaural cues; noise corrupted signal; probabilistic model; reverberant speech separation; source separation; synchronous audio stream; time-frequency domain; video signal; visual mask; Abstracts; Conferences; Dictionaries; Educational institutions; Lapping; Signal processing; Binaural source separation; audio-visual dictionary learning; interaural difference; matching pursuit; noise reduction;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Statistical Signal Processing Workshop (SSP), 2012 IEEE
Conference_Location :
Ann Arbor, MI
ISSN :
pending
Print_ISBN :
978-1-4673-0182-4
Electronic_ISBN :
pending
Type :
conf
DOI :
10.1109/SSP.2012.6319789
Filename :
6319789