DocumentCode
2869457
Title
Intensity/location normalization for automatic lipreading
Author
Tanaka, Akiji ; Vanegas, Oscar ; Tokuda, Keiichi ; Kitamura, Tadashi
Author_Institution
Dept. of Comput. Sci., Nagoya Inst. of Technol., Japan
Volume
2
fYear
1998
fDate
1998
Firstpage
920
Abstract
This paper describes intensity and location normalization techniques that improve the performance of a speech recognition system by using visual information in bimodal speech recognition. In conventional speech recognition, many methods have been proposed for normalizing channel characteristics and speaker individuality. In this study, two methods similar to CMN and SAT are proposed for intensity and location normalization, respectively, and two kinds of feature vectors, a subsampled image and a 2D-DCT, are compared. Experimental results show that the normalization techniques substantially improve the recognition rates.
Keywords
discrete cosine transforms; feature extraction; image recognition; image sampling; image sequences; speech recognition; 2D-DCT; automatic lipreading; bimodal speech recognition; channel characteristic normalization; feature vectors; image sequences; intensity/location normalization; recognition rate; speaker individuality; speech recognition system; subsampled image; visual information; Cepstral analysis; Character recognition; Computer science; Discrete cosine transforms; Error analysis; Image recognition; Image sequences; Lips; Robustness; Speech recognition;
fLanguage
English
Publisher
IEEE
Conference_Titel
Signal Processing Proceedings, 1998. ICSP '98. 1998 Fourth International Conference on
Conference_Location
Beijing
Print_ISBN
0-7803-4325-5
Type
conf
DOI
10.1109/ICOSP.1998.770762
Filename
770762