Title :
Lipreading from color motion video
Author :
Chiou, Greg I. ; Hwang, Jenq-Neng
Author_Institution :
Dept. of Electr. Eng., Washington Univ., Seattle, WA, USA
Abstract :
We have designed and implemented a lipreading system which recognizes isolated words using only color motion video of human lips (without acoustic data). The lipreading system performs color motion video recognition using “snakes” (active contour models), principal component analysis (PCA), and hidden Markov models (HMM). The snake algorithm and PCA are used to extract two sets of visual features from every frame (image) in the video sequence. The snake algorithm looks for contour features in the geometric space, while PCA seeks principal components in the eigenspace. An HMM recognizer is trained on, and then used to recognize, sequences of the combined visual features. With the visual information alone, we were able to achieve 94% recognition accuracy for 10 isolated words of a single speaker without using any special marker or lipstick.
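The recognition stage described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes one discrete-observation HMM per word, with each frame's visual feature vector already quantized to an integer symbol, and it classifies a sequence by picking the word model with the highest forward-algorithm likelihood. The word labels, the two-state left-to-right topology, and every probability below are made up for illustration.

```python
import math


def forward_log_likelihood(obs, start, trans, emit):
    """Return log P(obs | model) for a discrete HMM (scaled forward algorithm)."""
    n_states = len(start)
    # Forward variables for the first observation.
    alpha = [start[s] * emit[s][obs[0]] for s in range(n_states)]
    log_lik = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Propagate through the transition matrix, then weight by emission.
            alpha = [
                sum(alpha[p] * trans[p][s] for p in range(n_states)) * emit[s][obs[t]]
                for s in range(n_states)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # sequence is impossible under this model
        # Rescale to avoid underflow and accumulate the log of the scale factor.
        alpha = [a / scale for a in alpha]
        log_lik += math.log(scale)
    return log_lik


def recognize(obs, word_models):
    """Pick the word whose HMM assigns the observation sequence the highest likelihood."""
    return max(word_models, key=lambda w: forward_log_likelihood(obs, *word_models[w]))


# Hypothetical toy models: (start probabilities, transition matrix, emission matrix).
LEFT_TO_RIGHT = [[0.7, 0.3], [0.0, 1.0]]
WORD_MODELS = {
    "open": ([1.0, 0.0], LEFT_TO_RIGHT, [[0.9, 0.1], [0.1, 0.9]]),
    "close": ([1.0, 0.0], LEFT_TO_RIGHT, [[0.1, 0.9], [0.9, 0.1]]),
}

if __name__ == "__main__":
    print(recognize([0, 0, 1, 1], WORD_MODELS))
```

In a system like the one the abstract describes, the observations would be derived from the combined snake and PCA features of each frame, and the model parameters would be estimated from training sequences rather than written by hand.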
Keywords :
feature extraction; hidden Markov models; image colour analysis; image sequences; motion estimation; speech recognition; video signal processing; HMM; HMM recognizer; active contour models; color motion video recognition; contour features; eigenspace; geometric space; isolated word recognition; lipreading system; principal component analysis; recognition accuracy; snake algorithm; snakes; video sequence; visual feature extraction; visual information; Acoustic noise; Active contours; Humans; Lips; Noise cancellation; Signal processing; Working environment noise
Conference_Title :
Acoustics, Speech, and Signal Processing, 1996. ICASSP-96. Conference Proceedings., 1996 IEEE International Conference on
Conference_Location :
Atlanta, GA
Print_ISBN :
0-7803-3192-3
DOI :
10.1109/ICASSP.1996.545743