Title :
Pitch-aided spectral estimation for noise-robust speech recognition
Author :
Erell, Adoram ; Weintraub, Mitch
Author_Institution :
SRI Int., Menlo Park, CA, USA
Abstract :
A method is described for exploiting the quasi-periodicity of speech in minimum-mean-square-error (MMSE) estimation of the DFT log-amplitude, for either speech enhancement or noise-robust speech recognition. The estimator accounts for periodicity by conditioning the estimate for voiced speech on the distance between the frequency of each DFT coefficient and the nearest harmonic. The DFT estimator is also made conditional on the broadband spectrum, so that the correlation between distant frequencies is partially taken into account. The algorithm was tested with computer-room noise, both with an MSE criterion for the spectral envelope, defined by Mel-scale filterbank log-energies, and in recognition experiments. Periodicity conditioning significantly reduces the MSE for voiced speech. Recognition accuracy is not improved, however, because the overwhelming majority of errors occur in unvoiced speech.
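The conditioning variable mentioned in the abstract, the distance between a DFT coefficient's frequency and the nearest harmonic, can be illustrated with a minimal sketch. This is not the authors' estimator, which the abstract does not specify. The function name `harmonic_distances`, its parameters, and the convention that harmonics start at the fundamental (no harmonic at 0 Hz) are assumptions made for illustration only.

```python
def harmonic_distances(f0_hz, sample_rate_hz, n_fft):
    """Distance in Hz from each DFT bin centre to the nearest harmonic of f0.

    Illustrative only: covers bins 0 .. n_fft // 2 and assumes harmonics
    lie at k * f0_hz for k >= 1 (an assumption, not taken from the paper).
    """
    if f0_hz <= 0:
        raise ValueError("f0_hz must be positive")
    bin_width_hz = sample_rate_hz / n_fft
    distances = []
    for bin_index in range(n_fft // 2 + 1):
        freq_hz = bin_index * bin_width_hz
        # Index of the nearest harmonic, clamped so it is never below 1.
        harmonic = max(1, round(freq_hz / f0_hz))
        distances.append(abs(freq_hz - harmonic * f0_hz))
    return distances


# Example: 100 Hz pitch, 8 kHz sampling, 256-point DFT (all values hypothetical).
distances = harmonic_distances(100.0, 8000.0, 256)
```

Bins that fall on a harmonic get a distance of zero, and bins midway between two harmonics get a distance of half the pitch frequency. An estimator of the kind the abstract describes would use such a distance as a conditioning variable for voiced frames.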
Keywords :
fast Fourier transforms; noise; parameter estimation; speech recognition; DFT coefficient frequency; DFT estimator; DFT log-amplitude; FFT; MSE criterion; Mel-scale filterbank log-energies; broadband spectrum; computer-room noise; fast Fourier transform; minimum mean square error estimation; noise-robust speech recognition; periodicity-conditioning; pitch aided spectral estimation; spectral envelope; speech enhancement; speech quasi-periodicity; voiced speech; Cities and towns; Filter bank; Frequency estimation; Noise robustness; Power harmonic filters; Speech enhancement; Speech processing; Speech recognition; Stochastic processes; Testing;
Conference_Title :
Acoustics, Speech, and Signal Processing, 1991. ICASSP-91., 1991 International Conference on
Conference_Location :
Toronto, Ont.
Print_ISBN :
0-7803-0003-3
DOI :
10.1109/ICASSP.1991.150487