DocumentCode :
1172787
Title :
Pushing the envelope - aside [speech recognition]
Author :
Morgan, Nelson ; Zhu, Qifeng ; Stolcke, Andreas ; Sönmez, Kemal ; Sivadas, Sunil ; Shinozaki, Takahiro ; Ostendorf, Mari ; Jain, Pratibha ; Hermansky, Hynek ; Ellis, Dan ; Doddington, George ; Chen, Barry ; Çetin, Özgür ; Bourlard, Hervé ; Athineos, Marios
Volume :
22
Issue :
5
fYear :
2005
Firstpage :
81
Lastpage :
88
Abstract :
Despite successes, speech recognition performance still has significant limitations, particularly for conversational speech and/or for speech with significant acoustic degradation from noise or reverberation. For this reason, the authors have proposed methods that incorporate different (and larger) analysis windows, which are described in this article. Note in passing that the authors and many others have already taken advantage of processing techniques that incorporate information over long time ranges, for instance for normalization, by cepstral mean subtraction (B. Atal, 1974) or relative spectral analysis (RASTA) (H. Hermansky and N. Morgan, 1994). The authors have also proposed features based on speech sound class posterior probabilities, which have good properties for both classification and stream combination.
Keywords :
acoustic noise; probability; spectral analysis; speech recognition; acoustic degradations; acoustic reverberation; cepstral mean subtraction; conversational speech; relative spectral analysis; sound class posterior probabilities; Acoustic applications; Automatic speech recognition; Cepstral analysis; Deafness; Error analysis; Humans; Power system modeling; Telephony; Training data
fLanguage :
English
Journal_Title :
Signal Processing Magazine, IEEE
Publisher :
IEEE
ISSN :
1053-5888
Type :
jour
DOI :
10.1109/MSP.2005.1511826
Filename :
1511826