DocumentCode
1597458
Title
Improving speech analysis methods for robust automatic recognition
Author
O'Shaughnessy, Douglas
Author_Institution
Telecommun., INRS-EMT, Montreal, Que., Canada
Volume
1
fYear
2004
Firstpage
161
Abstract
Automatic speech recognition performs worse in noisy conditions, partly owing to the nature of the commonly used mel-frequency cepstrum. The cepstrum is sensitive to changes in the speech environment (e.g., noise or different speakers), that is, to mismatched conditions in which the speech used to train a recognizer differs significantly from that used in testing. The cepstrum also does not distinguish well between high- and low-amplitude portions of the speech spectrum. Peak-based measures, on the other hand, tolerate additive noise much better, as the main spectral peaks remain above the noise level. In this paper, we analyze the limitations of the cepstrum and demonstrate its weaknesses. We compare it to simple spectral peak-based measures and show that such methods are computationally less intensive than the cepstrum while remaining more reliable under various noise conditions.
Keywords
cepstral analysis; speech processing; speech recognition; additive noise; automatic speech recognition; mel-frequency cepstral coefficients; mel-frequency cepstrum; noisy speech environment; robust ASR system; spectral peak-based methods; speech analysis methods; speech spectrum high amplitude portions; speech spectrum low amplitude portions; Additive noise; Automatic speech recognition; Cepstrum; Noise measurement; Noise robustness; Speech analysis; Speech enhancement; Speech recognition; Testing; Working environment noise;
fLanguage
English
Publisher
IEEE
Conference_Titel
Electrical and Computer Engineering, 2004. Canadian Conference on
ISSN
0840-7789
Print_ISBN
0-7803-8253-6
Type
conf
DOI
10.1109/CCECE.2004.1344981
Filename
1344981