DocumentCode :
1833135
Title :
HMM-based audio-visual speech recognition integrating geometric and appearance-based visual features
Author :
Chan, Michael T.
Author_Institution :
Rockwell Inst. Sci. Center, Thousand Oaks, CA, USA
fYear :
2001
fDate :
2001
Firstpage :
9
Lastpage :
14
Abstract :
A good front end for visual feature extraction is an important element of audio-visual speech recognition systems. We propose a new visual feature representation that combines both geometric- and pixel-based features. Using our previously developed contour-based lip-tracking algorithm, geometric features including the height and width of the lips are automatically extracted. Lip boundary tracking allows accurate determination of a region of interest from which we construct pixel-based features that are robust to variation in scale and translation. Motivated by computational considerations, we selected a subset of the pixels in the center of the inner mouth area that was found to capture sufficient details of the appearance of the teeth and tongue for assisting in the discrimination of spoken words. We show the advantage of the combination of these visual features for visual-only and audio-visual speech recognition of isolated digits.
Keywords :
audio-visual systems; feature extraction; hidden Markov models; image recognition; object recognition; speech recognition; tracking; HMM-based methods; appearance-based visual features; audio-visual speech recognition; contour-based lip-tracking algorithm; geometric-based visual features; hidden Markov model-based methods; pixel-based features; region of interest; visual feature extraction; visual-only speech recognition; Active shape model; Deformable models; Feature extraction; Hidden Markov models; Lips; Mouth; Robustness; Speech recognition; Teeth; Tongue;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Multimedia Signal Processing, 2001 IEEE Fourth Workshop on
Conference_Location :
Cannes
Print_ISBN :
0-7803-7025-2
Type :
conf
DOI :
10.1109/MMSP.2001.962703
Filename :
962703