Title :
Improving speech analysis methods for robust automatic recognition
Author :
O'Shaughnessy, Douglas
Author_Institution :
Telecommun., INRS-EMT, Montreal, Que., Canada
Abstract :
Automatic speech recognition performs worse in noisy conditions, partly owing to the nature of the commonly used mel-frequency cepstrum. The cepstrum is sensitive to changes in the speech environment (e.g., noise or different speakers), that is, to mismatched conditions in which the speech used to train a recognizer differs significantly from that used in testing. The cepstrum also does not distinguish well between high- and low-amplitude portions of the speech spectrum. Peak-based measures, on the other hand, tolerate additive noise much better, as the main spectral peaks remain above the noise level. In this paper, we analyze the limitations of the cepstrum and demonstrate its weaknesses. We compare it to simple spectral peak-based measures and show that the latter are both computationally less intensive than the cepstrum and more reliable under various noise conditions.
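The contrast drawn in the abstract can be illustrated with a toy sketch. It is not the paper's method or data. It assumes a synthetic power spectrum with two Gaussian "formant" bumps, flat additive noise in the power domain, and a simplified cepstrum (a DCT-II of the log power spectrum with no mel filterbank). All function names and parameter values below are hypothetical choices made for illustration.

```python
import math

def log_spectrum_cepstrum(power, n_coeffs=8):
    """Unscaled DCT-II of the log power spectrum (simplified, non-mel cepstrum)."""
    n = len(power)
    log_p = [math.log(p) for p in power]
    return [
        sum(log_p[k] * math.cos(math.pi * q * (k + 0.5) / n) for k in range(n))
        for q in range(n_coeffs)
    ]

def top_peaks(power, n_peaks=2):
    """Bin indices of the n_peaks largest local maxima, sorted by bin."""
    maxima = [
        k for k in range(1, len(power) - 1)
        if power[k] > power[k - 1] and power[k] >= power[k + 1]
    ]
    maxima.sort(key=lambda k: power[k], reverse=True)
    return sorted(maxima[:n_peaks])

def synthetic_spectrum(n_bins=64, centers=(12, 40), width=3.0, floor=1e-4):
    """Two Gaussian bumps on a very low floor (assumed toy data, not real speech)."""
    return [
        floor + sum(math.exp(-((k - c) ** 2) / (2 * width ** 2)) for c in centers)
        for k in range(n_bins)
    ]

clean = synthetic_spectrum()
noisy = [p + 0.05 for p in clean]  # assumed flat additive noise, power domain

cep_clean = log_spectrum_cepstrum(clean)
cep_noisy = log_spectrum_cepstrum(noisy)

# Euclidean distance between cepstra, skipping coefficient 0 (overall level).
cepstral_shift = math.sqrt(
    sum((a - b) ** 2 for a, b in zip(cep_clean[1:], cep_noisy[1:]))
)

peaks_clean = top_peaks(clean)
peaks_noisy = top_peaks(noisy)

print("peak bins (clean):", peaks_clean)
print("peak bins (noisy):", peaks_noisy)
print("cepstral shift:", cepstral_shift)
```

In this sketch the noise floor lifts the low-amplitude valleys of the log spectrum far more than the peaks, so the cepstral coefficients move substantially while the peak locations stay the same. Real speech and non-flat noise would behave less cleanly than this toy case.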
Keywords :
cepstral analysis; speech processing; speech recognition; additive noise; automatic speech recognition; mel-frequency cepstral coefficients; mel-frequency cepstrum; noisy speech environment; robust ASR system; spectral peak-based methods; speech analysis methods; speech spectrum high amplitude portions; speech spectrum low amplitude portions; Additive noise; Automatic speech recognition; Cepstrum; Noise measurement; Noise robustness; Speech analysis; Speech enhancement; Speech recognition; Testing; Working environment noise;
Conference_Title :
Electrical and Computer Engineering, 2004. Canadian Conference on
Print_ISBN :
0-7803-8253-6
DOI :
10.1109/CCECE.2004.1344981