DocumentCode :
431556
Title :
Visual speech features representation for automatic lip-reading
Author :
Sagheer, Alaa ; Tsuruta, Naoyuki ; Taniguchi, Rin-Ichiro ; Maeda, Sakashi
Author_Institution :
Dept. of Intelligent Syst., Kyushu Univ., Japan
Volume :
2
fYear :
2005
fDate :
18-23 March 2005
Abstract :
A fundamental task in pattern recognition is finding a suitable representation for features. We present a new visual speech feature representation approach that combines the hypercolumn model (HCM) with hidden Markov models (HMMs) to form a complete lip-reading system. The HCM extracts visual speech features from the input image, and the extracted features are modeled by Gaussian distributions within the HMM. The proposed lip-reading system works under varying lip positions and sizes. All images were captured in a natural environment without special lighting or lip markers. In experiments, the HCM-based system outperforms two previously reported systems, one based on self-organizing maps (SOM) and one based on the discrete cosine transform (DCT).
Keywords :
Gaussian distribution; feature extraction; gesture recognition; hidden Markov models; image representation; speech recognition; Gaussian distributions; HMM; automatic lip-reading; hypercolumn model; pattern recognition; visual feature extraction; visual speech feature representation; visual speech recognition; Discrete cosine transforms; Discrete wavelet transforms; Feature extraction; Gaussian distribution; Hidden Markov models; Image recognition; Mouth; Neural networks; Shape measurement; Speech recognition;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Acoustics, Speech, and Signal Processing, 2005. Proceedings. (ICASSP '05). IEEE International Conference on
ISSN :
1520-6149
Print_ISBN :
0-7803-8874-7
Type :
conf
DOI :
10.1109/ICASSP.2005.1415521
Filename :
1415521