DocumentCode
442173
Title
Adaptive stream reliability modeling based on local dispersion measures for audio visual speech recognition
Author
Xie, Lei ; Zhao, Rong-chun ; Liu, Zhi-Qiang
Author_Institution
Center for Media Technol., City Univ. of Hong Kong, Kowloon, China
Volume
8
fYear
2005
fDate
18-21 Aug. 2005
Firstpage
4852
Abstract
This paper proposes an adaptive stream reliability modeling technique for audio visual speech recognition (AVSR). Because recognition conditions vary locally, we present two local measures, frame dispersion and window dispersion, to depict the temporal discriminative powers and noise levels of both audio and visual streams. The dispersions are subsequently mapped to stream exponents according to the minimum classification error (MCE) criterion. Experiments on a connected-digits task show that our method consistently outperforms the popular discriminative training (DT) and grid search (GS) methods at various signal-to-noise ratios (SNRs), for example improving word accuracy rate (WAR) from 94.7% to 96.4% at 28 dB SNR.
Keywords
audio-visual systems; noise; reliability theory; speech recognition; video streaming; adaptive stream reliability modeling; audio visual speech recognition; discriminative training method; frame dispersion measures; grid search method; local dispersion measures; minimum classification error; signal noise ratios; window dispersion measures; Acoustic noise; Cepstral analysis; Computer science; Dispersion; Hidden Markov models; Noise level; Noise measurement; Signal to noise ratio; Speech recognition; Streaming media; Lipreading; MCE-GPD; audio visual speech recognition; dispersion; stream exponents;
fLanguage
English
Publisher
IEEE
Conference_Titel
Machine Learning and Cybernetics, 2005. Proceedings of 2005 International Conference on
Conference_Location
Guangzhou, China
Print_ISBN
0-7803-9091-1
Type
conf
DOI
10.1109/ICMLC.2005.1527797
Filename
1527797