Title :
Bayesian separation of audio-visual speech sources
Author :
Rajaram, Shyamsundar ; Nefian, Ara V. ; Huang, Thomas S.
Author_Institution :
Image Formation & Process. Group, Illinois Univ., Urbana, IL, USA
Abstract :
In this paper, we investigate the use of audio and visual features, rather than audio features alone, for speech separation in acoustically noisy environments. Existing independent component analysis (ICA) systems can separate a large variety of signals, including speech, but their success is often limited by the technique's ability to handle noise. We introduce a Bayesian model for the mixing process that describes both the bimodality and the time dependency of speech sources. Our experimental results show that the online demixing process presented here outperforms both ICA and the audio-only Bayesian model at all levels of noise.
Keywords :
Bayes methods; acoustic noise; feature extraction; source separation; speech processing; video signal processing; Bayesian audio-visual speech source separation; Bayesian mixing process model; ICA; acoustically noisy environments; online demixing process; speech source bimodality; speech source time dependency; visual feature extraction; Bayesian methods; Face detection; Independent component analysis; Microphone arrays; Mouth; Speech enhancement; Working environment noise
Conference_Title :
Acoustics, Speech, and Signal Processing, 2004. Proceedings. (ICASSP '04). IEEE International Conference on
Print_ISBN :
0-7803-8484-9
DOI :
10.1109/ICASSP.2004.1327196