DocumentCode :
3349259
Title :
Improved face and feature finding for audio-visual speech recognition in visually challenging environments
Author :
Jiang, Jintao ; Potamianos, Gerasimos ; Nock, Harriet ; Iyengar, Giridharan ; Neti, Chalapathy
Author_Institution :
Dept. of Electr. Eng., California Univ., Los Angeles, CA, USA
Volume :
5
fYear :
2004
fDate :
17-21 May 2004
Abstract :
Visual information in a speaker's face is known to improve the robustness of automatic speech recognition (ASR). However, most studies in audio-visual ASR have focused on "visually clean" data to benefit ASR in noise. This paper is a follow-up to a previous study that investigated audio-visual ASR in visually challenging environments. It focuses on visual speech front-end processing, and it proposes an improved, appearance-based face and feature detection algorithm that utilizes Gaussian mixture model classifiers. This method is shown to improve the accuracy of face and feature detection, and thus visual speech recognition, over our previously used baseline system. In turn, this translates to improved audio-visual ASR, resulting in a 10% relative reduction of the word-error-rate in noisy speech.
Keywords :
Gaussian processes; face recognition; feature extraction; speech recognition; Gaussian mixture model classifiers; appearance based face detection; audio-visual ASR; audio-visual speech recognition; automatic speech recognizers; detection accuracy; feature detection; noisy speech word-error-rate; speaker face visual information; visual speech front end processing; visually challenging environments; Automatic speech recognition; Computer vision; Detection algorithms; Face detection; Face recognition; Facial features; Linear discriminant analysis; Speech enhancement; Speech recognition; Working environment noise;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Acoustics, Speech, and Signal Processing, 2004. Proceedings. (ICASSP '04). IEEE International Conference on
ISSN :
1520-6149
Print_ISBN :
0-7803-8484-9
Type :
conf
DOI :
10.1109/ICASSP.2004.1327250
Filename :
1327250