DocumentCode :
3348372
Title :
Bayesian separation of audio-visual speech sources
Author :
Rajaram, Shyamsundar ; Nefian, Ara V. ; Huang, Thomas S.
Author_Institution :
Image Formation & Process. Group, Illinois Univ., Urbana, IL, USA
Volume :
5
fYear :
2004
fDate :
17-21 May 2004
Abstract :
In this paper, we investigate the use of audio and visual features, rather than audio features alone, for speech separation in acoustically noisy environments. Existing independent component analysis (ICA) systems can separate a large variety of signals, including speech, but their success is often limited by how well the technique handles noise. We introduce a Bayesian model for the mixing process that describes both the bimodality and the time dependency of speech sources. Our experimental results show that the online demixing process presented here outperforms both ICA and the audio-only Bayesian model at all levels of noise.
Keywords :
Bayes methods; acoustic noise; feature extraction; source separation; speech processing; video signal processing; Bayesian audio-visual speech source separation; Bayesian mixing process model; ICA; acoustically noisy environments; online demixing process; speech source bimodality; speech source time dependency; visual feature extraction; Acoustic noise; Bayesian methods; Face detection; Feature extraction; Independent component analysis; Microphone arrays; Mouth; Speech enhancement; Speech processing; Working environment noise;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Acoustics, Speech, and Signal Processing, 2004. Proceedings. (ICASSP '04). IEEE International Conference on
ISSN :
1520-6149
Print_ISBN :
0-7803-8484-9
Type :
conf
DOI :
10.1109/ICASSP.2004.1327196
Filename :
1327196