Title :
Adaptive stream reliability modeling based on local dispersion measures for audio visual speech recognition
Author :
Xie, Lei ; Zhao, Rong-chun ; Liu, Zhi-Qiang
Author_Institution :
Center for Media Technol., City Univ. of Hong Kong, Kowloon, China
Abstract :
This paper proposes an adaptive stream reliability modeling technique for audio-visual speech recognition (AVSR). Because recognition conditions vary locally, we present two local measures, frame dispersion and window dispersion, that capture the temporal discriminative power and noise level of both the audio and visual streams. The dispersions are then mapped to stream exponents according to the minimum classification error (MCE) criterion. Experiments on a connected-digits task show that our method consistently outperforms the popular discriminative training (DT) and grid search (GS) methods at various signal-to-noise ratios (SNRs), improving, for example, the word accuracy rate (WAR) from 94.7% to 96.4% at 28 dB SNR.
Keywords :
audio-visual systems; noise; reliability theory; speech recognition; video streaming; adaptive stream reliability modeling; audio visual speech recognition; discriminative training method; frame dispersion measures; grid search method; local dispersion measures; minimum classification error; signal noise ratios; window dispersion measures; Acoustic noise; Cepstral analysis; Computer science; Dispersion; Hidden Markov models; Noise level; Noise measurement; Signal to noise ratio; Speech recognition; Streaming media; Lipreading; MCE-GPD; audio visual speech recognition; dispersion; stream exponents;
Conference_Title :
Machine Learning and Cybernetics, 2005. Proceedings of 2005 International Conference on
Conference_Location :
Guangzhou, China
Print_ISBN :
0-7803-9091-1
DOI :
10.1109/ICMLC.2005.1527797