Title :
Visual speech features representation for automatic lip-reading
Author :
Sagheer, Alaa ; Tsuruta, Naoyuki ; Taniguchi, Rin-Ichiro ; Maeda, Sakashi
Author_Institution :
Dept. of Intelligent Syst., Kyushu Univ., Japan
Abstract :
A fundamental task in pattern recognition is finding a suitable representation for a feature. We present a new visual speech feature representation approach that combines the hypercolumn model (HCM) with hidden Markov models (HMMs) to build a complete lip-reading system. We use the HCM to extract visual speech features from the input image. The extracted features are then modeled by Gaussian distributions using an HMM. The proposed lip-reading system works under varying lip positions and sizes. All images were captured in a natural environment, without special lighting or lip markers. In experiments, the HCM-based system outperforms two previously reported systems, one based on self-organizing maps (SOM) and one based on the discrete cosine transform (DCT).
Keywords :
Gaussian distribution; feature extraction; gesture recognition; hidden Markov models; image representation; speech recognition; Gaussian distributions; HMM; automatic lip-reading; hypercolumn model; pattern recognition; visual feature extraction; visual speech feature representation; visual speech recognition; Discrete cosine transforms; Discrete wavelet transforms; Image recognition; Mouth; Neural networks; Shape measurement
Conference_Title :
Acoustics, Speech, and Signal Processing, 2005. Proceedings. (ICASSP '05). IEEE International Conference on
Print_ISBN :
0-7803-8874-7
DOI :
10.1109/ICASSP.2005.1415521