DocumentCode :
1749631
Title :
Weighting schemes for audio-visual fusion in speech recognition
Author :
Glotin, Hervé ; Vergyri, D. ; Neti, Chalapathy ; Potamianos, Gerasimos ; Luettin, Juergen
Author_Institution :
ICP, Grenoble, France
Volume :
1
fYear :
2001
fDate :
2001
Firstpage :
173
Abstract :
We demonstrate an improvement in state-of-the-art large vocabulary continuous speech recognition (LVCSR) performance, under both clean and noisy conditions, by using visual information in addition to the traditional audio stream. We take a decision fusion approach to the audio-visual information, in which the single-modality (audio-only and visual-only) HMM classifiers are combined to recognize audio-visual speech. More specifically, we tackle the problem of estimating the appropriate combination weights for each of the modalities. Two different techniques are described. The first uses an automatically extracted estimate of the audio stream reliability to modify the weights for each modality (results are reported for both clean and noisy audio). The second is a discriminative model combination approach in which weights on pre-defined model classes are optimized to minimize word error rate (WER) (results are reported for clean audio only).
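The first weighting scheme in the abstract can be illustrated with a minimal sketch. It assumes the common form of decision fusion in which per-class log-likelihoods from the audio and visual classifiers are combined linearly with a weight that grows with audio reliability. The function names (audio_weight, fuse_log_likelihoods), the linear reliability-to-weight mapping, and the example scores are all hypothetical and are not taken from the paper, which does not specify them in this record.

```python
import numpy as np


def audio_weight(reliability, lo=0.0, hi=1.0):
    """Map an audio reliability estimate to a stream weight in [0, 1].

    Assumption: a simple clipped linear mapping; the paper's actual
    mapping is not given in this record.
    """
    return float(np.clip((reliability - lo) / (hi - lo), 0.0, 1.0))


def fuse_log_likelihoods(audio_ll, visual_ll, lam):
    """Combine per-class log-likelihoods from two single-modality classifiers.

    lam is the audio weight; the visual stream receives 1 - lam.
    """
    audio_ll = np.asarray(audio_ll, dtype=float)
    visual_ll = np.asarray(visual_ll, dtype=float)
    return lam * audio_ll + (1.0 - lam) * visual_ll


# Hypothetical scores for three candidate classes.
audio_ll = [-2.0, -1.0, -3.0]
visual_ll = [-1.0, -4.0, -2.0]

# Reliable audio: the decision follows the audio classifier.
clean = fuse_log_likelihoods(audio_ll, visual_ll, audio_weight(1.0))
# Unreliable audio: the decision follows the visual classifier.
noisy = fuse_log_likelihoods(audio_ll, visual_ll, audio_weight(0.0))
# Intermediate reliability: both streams contribute equally.
mixed = fuse_log_likelihoods(audio_ll, visual_ll, audio_weight(0.5))
```

In this sketch the fused decision is the class with the highest combined score, for example int(np.argmax(mixed)). The second technique in the abstract, discriminative model combination, would instead learn separate weights per model class by minimizing WER on held-out data, which this sketch does not attempt.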
Keywords :
Gaussian distribution; audio signal processing; decision theory; hidden Markov models; sensor fusion; speech recognition; video signal processing; audio stream; audio-visual fusion; clean conditions; decision fusion approach; discriminative model combination approach; large vocabulary continuous speech recognition; noisy conditions; single-modality HMM classifiers; visual information; weighting schemes; Acoustic noise; Art; Audio databases; Automatic speech recognition; Hidden Markov models; Signal to noise ratio; Speech enhancement; Speech recognition; Streaming media; Vocabulary;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Acoustics, Speech, and Signal Processing, 2001. Proceedings. (ICASSP '01). 2001 IEEE International Conference on
Conference_Location :
Salt Lake City, UT
ISSN :
1520-6149
Print_ISBN :
0-7803-7041-4
Type :
conf
DOI :
10.1109/ICASSP.2001.940795
Filename :
940795