A combination of discriminative and maximum likelihood techniques for noise robust speech recognition

Author

Laurila, Kari ; Vasilache, Marcel ; Viikki, Olli

Author_Institution

Speech & Audio Syst. Lab., Nokia Res. Center, Tampere, Finland

Volume

1

fYear

1998

fDate

12-15 May 1998

Firstpage

85

Abstract

We study how discriminative and maximum likelihood (ML) techniques should be combined in order to maximize the recognition accuracy of a speaker-independent automatic speech recognition (ASR) system that includes speaker adaptation. We compare two training approaches for the speaker-independent case and examine how well they perform together with four different speaker adaptation schemes. In a noise robust connected digit recognition task, we show that the minimum classification error (MCE) training approach for speaker-independent modelling, together with the Bayesian speaker adaptation scheme, provides the highest classification accuracy over the whole lifespan of an ASR system. With MCE training, we reduce recognition errors by 30% relative to the ML approach in the speaker-independent case. With the Bayesian speaker adaptation scheme, we can further reduce the error rates by 62% using as few as five adaptation utterances.

Keywords

Bayes methods; error statistics; hidden Markov models; maximum likelihood estimation; noise; pattern classification; speech recognition; Bayesian speaker adaptation; HMM; MCE training; ML approach; adaptation utterances; car environment; classification accuracy; connected digit recognition task; discriminative techniques; error rate reduction; hands-free voice dialling; maximum likelihood techniques; minimum classification error training; noise robust speech recognition; recognition accuracy; recognition error reduction; speaker-independent automatic speech recognition; Automatic speech recognition; Bayesian methods; Hidden Markov models; Maximum likelihood estimation; Noise robustness; Speech enhancement; Speech recognition; Target recognition; Training data; Working environment noise

fLanguage

English

Publisher

IEEE

Conference_Titel

Acoustics, Speech and Signal Processing, 1998. Proceedings of the 1998 IEEE International Conference on

Conference_Location

Seattle, WA

ISSN

1520-6149

Print_ISBN

0-7803-4428-6

Type

conf

DOI

10.1109/ICASSP.1998.674373

Filename

674373