DocumentCode
918762
Title
Filterbank-energy estimation using mixture and Markov models for recognition of noisy speech
Author
Erell, Adoram ; Weintraub, Mitchel
Author_Institution
SRI Inst., Menlo Park, CA, USA
Volume
1
Issue
1
fYear
1993
fDate
1/1/1993
Firstpage
68
Lastpage
76
Abstract
An estimation algorithm for noise-robust speech recognition, the minimum mean log spectral distance (MMLSD) algorithm, is presented. The estimation is matched to the recognizer by seeking to minimize the average distortion as measured by a Euclidean distance between filterbank log-energy vectors, which approximates the weighted-cepstral distance used by the recognizer. The estimate is computed using a clean-speech spectral probability distribution, estimated from a database, and a stationary ARMA model for the noise. When the recognizer is trained on clean speech and tested with additive white noise at 10-dB SNR, the recognition accuracy with the MMLSD algorithm is comparable to that achieved by training the recognizer at the same constant 10-dB SNR. The algorithm is also highly efficient with quasi-stationary environmental noise, recorded with a desktop microphone, and requires almost no tuning to differences between this noise and the computer-generated white noise.
Keywords
Markov processes; filtering and prediction theory; speech recognition; white noise; 10 dB; ARMA model; Euclidean distance; MMLSD algorithm; Markov models; SNR; additive white noise; average distortion; clean speech; computer-generated white noise; database; desktop microphone; estimation algorithm; filterbank energy estimation; filterbank log-energy vectors; minimum mean log spectral distance algorithm; noise robust speech recognition; noisy speech; quasistationary environmental noise; recognition accuracy; speech spectral probability distribution; weighted-cepstral distance; Distortion measurement; Distributed computing; Euclidean distance; Filter bank; Noise robustness; Probability distribution; Signal to noise ratio; Speech enhancement; Speech recognition; Working environment noise;
fLanguage
English
Journal_Title
Speech and Audio Processing, IEEE Transactions on
Publisher
IEEE
ISSN
1063-6676
Type
jour
DOI
10.1109/89.221385
Filename
221385