Title :
Warping and scaling of the minimum variance distortionless response
Author :
Wolfel, Matthias ; McDonough, John ; Waibel, Alex
Author_Institution :
Interactive Syst. Labs., Karlsruhe Univ., Germany
Date :
30 Nov.-3 Dec. 2003
Abstract :
Spectral estimation based on the minimum variance distortionless response (MVDR) is well known in the signal processing literature and has been shown to be superior to linear prediction for robust speech recognition. In this work we propose two techniques to improve the resolution and the robustness of the MVDR spectral estimate. The first is a time-domain technique for estimating an all-pole model on a warped short-time frequency axis, such as the Mel-frequency scale. The second is a method for scaling the height of the spectral envelope in order to extract robust features for large vocabulary continuous speech recognition systems that must operate in noisy conditions. Moreover, we show that these two techniques can be combined to good effect. In a series of speech recognition experiments on the Switchboard corpus, the combination of our proposed approaches achieved a word error rate (WER) of 35.9%, which is clearly superior to the 37.0% WER obtained by the common MVDR and the 37.2% WER obtained by the widely used Fourier transform.
Keywords :
error statistics; feature extraction; poles and zeros; signal resolution; spectral analysis; speech recognition; time-domain analysis; MVDR; Mel-frequency; Switchboard corpus; WER; all-pole model; height scaling; large vocabulary continuous speech recognition; minimum variance distortionless response; robust feature extraction; robust speech recognition; spectral envelope; spectral estimation; time-domain technique; warped short time frequency axis; word error rate; Distortion; Feature extraction; Frequency estimation; Robustness; Signal processing; Signal resolution; Speech recognition; Time domain analysis; Time frequency analysis; Vocabulary
Conference_Title :
Automatic Speech Recognition and Understanding, 2003. ASRU '03. 2003 IEEE Workshop on
Print_ISBN :
0-7803-7980-2
DOI :
10.1109/ASRU.2003.1318472