DocumentCode
1749628
Title
Optimal weighting of posteriors for audio-visual speech recognition
Author
Heckmann, Martin ; Berthommier, Frédéric ; Kroschel, Kristian
Author_Institution
Inst. de la Communication Parlée, Inst. Nat. Polytech. de Grenoble, France
Volume
1
fYear
2001
fDate
2001
Firstpage
161
Abstract
We investigate the fusion of audio and video a posteriori phonetic probabilities in a hybrid ANN/HMM audio-visual speech recognition system. Three basic conditions for the fusion process are stated and implemented in a linear and a geometric weighting scheme. These conditions are the assumption of conditional independence of the audio and video data, and the contribution of only one of the two paths when the SNR is very high or very low, respectively. For the geometric weighting, a new weighting scheme is developed, whereas the linear weighting follows the full combination approach as employed in multi-stream recognition. We compare these two new concepts in audio-visual recognition to a rather standard approach known from the literature. Recognition tests were performed in a continuous number recognition task on a single-speaker database containing 1712 utterances with two different types of noise added.
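Illustration
The difference between linear and geometric weighting of two posterior streams can be sketched as follows. This is a generic illustration only, not the formulation from the paper: the function names, the single `weight` parameter, and the example posteriors are all assumptions, and the sketch covers only the two boundary conditions (one stream alone at very high or very low SNR), not the conditional independence condition.

```python
def linear_fusion(p_audio, p_video, weight):
    """Weighted arithmetic mean of two posterior distributions (illustrative)."""
    return [weight * a + (1.0 - weight) * v for a, v in zip(p_audio, p_video)]


def geometric_fusion(p_audio, p_video, weight):
    """Weighted geometric mean of two posteriors, renormalized to sum to 1."""
    raw = [(a ** weight) * (v ** (1.0 - weight)) for a, v in zip(p_audio, p_video)]
    total = sum(raw)
    return [r / total for r in raw]


# Hypothetical posteriors over three phoneme classes for a single frame.
p_audio = [0.7, 0.2, 0.1]
p_video = [0.3, 0.5, 0.2]

# weight = 1.0 keeps only the audio stream (assumed: very high SNR);
# weight = 0.0 keeps only the video stream (assumed: very low SNR).
audio_only = geometric_fusion(p_audio, p_video, 1.0)
video_only = linear_fusion(p_audio, p_video, 0.0)

# Intermediate weights mix both streams.
mixed_linear = linear_fusion(p_audio, p_video, 0.5)
mixed_geometric = geometric_fusion(p_audio, p_video, 0.5)
```

In this sketch both schemes return a valid distribution for any weight between 0 and 1; how the weight would be chosen as a function of SNR is left open here.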
Keywords
Gaussian noise; audio signal processing; hidden Markov models; neural nets; probability; sensor fusion; speech recognition; video signal processing; white noise; a posteriori phonetic probabilities; audio-visual speech recognition; continuous number recognition task; full combination approach; geometric weighting scheme; hybrid ANN/HMM system; linear weighting scheme; optimal weighting; posteriors; single speaker database; Acoustic noise; Audio databases; Feature extraction; Hidden Markov models; Lips; Performance evaluation; Spatial databases; Speech recognition; Streaming media; Testing;
fLanguage
English
Publisher
IEEE
Conference_Titel
Acoustics, Speech, and Signal Processing, 2001. Proceedings. (ICASSP '01). 2001 IEEE International Conference on
Conference_Location
Salt Lake City, UT
ISSN
1520-6149
Print_ISBN
0-7803-7041-4
Type
conf
DOI
10.1109/ICASSP.2001.940792
Filename
940792