DocumentCode :
2892107
Title :
Boosted audio-visual HMM for speech reading
Author :
Yin, Pei ; Essa, Irfan ; Rehg, James M.
Author_Institution :
GVU Center, Georgia Inst. of Technol., Atlanta, GA, USA
Volume :
2
fYear :
2003
fDate :
9-12 Nov. 2003
Firstpage :
2013
Abstract :
We propose a new approach for combining acoustic and visual measurements to aid in recognizing the lip shapes of a person speaking. Our method relies on computing the maximum likelihoods of (a) HMMs used to model phonemes from the acoustic signal, and (b) HMMs used to model visual feature motions from video. One significant addition in this work is dynamic analysis with features selected by AdaBoost on the basis of their discriminant ability. This form of integration, leading to a boosted HMM, lets AdaBoost find the best features first and then uses the HMM to exploit the dynamic information inherent in the signal.
Keywords :
audio-visual systems; feature extraction; hidden Markov models; maximum likelihood estimation; speech processing; speech recognition; video signal processing; AdaBoost; acoustic measurement; acoustic signal; boosted audio-visual HMM; dynamic analysis; feature selection; hidden Markov model; lip shape recognition; maximum likelihood; phoneme model; speech reading; video signal; visual feature motion; visual measurement; Acoustic applications; Acoustic measurements; Educational institutions; Face detection; Hidden Markov models; Information analysis; Natural languages; Shape measurement; Signal analysis; Speech recognition;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Signals, Systems and Computers, 2004. Conference Record of the Thirty-Seventh Asilomar Conference on
Print_ISBN :
0-7803-8104-1
Type :
conf
DOI :
10.1109/ACSSC.2003.1292334
Filename :
1292334