A linear predictive front-end processor for speech recognition in noisy environments

Author

Ephraim, Yariv ; Wilpon, Jay G. ; Rabiner, Lawrence R.

Author_Institution

AT&T Bell Laboratories, Murray Hill, NJ

Volume

12

fYear

1987

fDate

April 1987

Firstpage

1324

Lastpage

1327

Abstract

We investigate the performance of a recent algorithm for linear predictive (LP) modeling of speech signals degraded by uncorrelated additive noise when it is used as a front-end processor in a speech recognition system. The system is speaker dependent and recognizes isolated words using dynamic time warping. The LP model for the clean speech is estimated through appropriate composite modeling of the noisy speech, by minimizing the Itakura-Saito distortion measure between the sample spectrum of the noisy speech and the power spectral density of the composite model. This approach results in a "filtering-modeling" scheme in which the filter for the noisy speech and the LP model for the clean speech are alternately optimized. The proposed system was tested on a vocabulary consisting of the 26 letters of the English alphabet, the ten English digits, and the three command words "stop," "error," and "repeat," contaminated by additive white noise at signal-to-noise ratios (SNRs) of 5-20 dB. When the standard LP analysis is replaced with the proposed algorithm, with training on clean speech and testing on noisy speech, we achieve an improvement in recognition accuracy equivalent to an increase in input SNR of approximately 10 dB.
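The abstract states the objective only in words, so the following is a minimal Python sketch of what it measures, not the authors' implementation. It assumes the common discrete form of the Itakura-Saito distortion (the mean of P/Q - log(P/Q) - 1 over frequency bins) and assumes the composite model is an all-pole LP spectrum plus a flat white-noise floor, which matches the white-noise experiments but is not spelled out in this record. The function names, the first-order coefficients, and the noise variance are hypothetical, and the alternating "filtering-modeling" optimization itself is not shown.

```python
import cmath
import math


def lp_power_spectrum(lp_coeffs, gain, n_freqs):
    """All-pole PSD gain / |A(e^{jw})|^2 sampled at n_freqs points in [0, pi)."""
    spectrum = []
    for k in range(n_freqs):
        w = math.pi * k / n_freqs
        # A(e^{jw}) = 1 + sum_i a_i * e^{-j*w*i}, with i starting at 1
        a = 1.0 + sum(c * cmath.exp(-1j * w * (i + 1))
                      for i, c in enumerate(lp_coeffs))
        spectrum.append(gain / abs(a) ** 2)
    return spectrum


def composite_spectrum(lp_coeffs, gain, noise_var, n_freqs):
    """Assumed composite model: clean-speech LP spectrum plus a flat noise floor."""
    return [s + noise_var
            for s in lp_power_spectrum(lp_coeffs, gain, n_freqs)]


def itakura_saito(sample_spectrum, model_spectrum):
    """Mean of P/Q - log(P/Q) - 1 over bins; non-negative, zero when P equals Q."""
    total = 0.0
    for p, q in zip(sample_spectrum, model_spectrum):
        ratio = p / q
        total += ratio - math.log(ratio) - 1.0
    return total / len(sample_spectrum)


# Hypothetical example: a first-order model in white noise of variance 0.1.
target = composite_spectrum([-0.9], 1.0, 0.1, 64)
mismatch = composite_spectrum([-0.5], 1.0, 0.1, 64)
d_same = itakura_saito(target, target)    # identical spectra
d_diff = itakura_saito(target, mismatch)  # mismatched LP coefficient
```

In this sketch the distortion is zero when the model spectrum equals the sample spectrum and grows as the LP coefficient moves away from the one that generated it, which is the property a minimization over the composite model would exploit.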

Keywords

Distortion measurement; Power system modeling; Prediction algorithms; Predictive models; Signal to noise ratio; Speech analysis; Speech enhancement; Speech processing; Speech recognition; Working environment noise

fLanguage

English

Publisher

IEEE

Conference_Title

Acoustics, Speech, and Signal Processing, IEEE International Conference on ICASSP '87.

Type

conf

DOI

10.1109/ICASSP.1987.1169458

Filename

1169458