Adaptation of HMMs in the presence of additive and convolutional noise

Author

Hirsch, Hans-Günter

Author_Institution

Ericsson Eurolab Deutschland GmbH, Nuremberg, Germany

fYear

1997

fDate

14-17 Dec 1997

Firstpage

412

Lastpage

419

Abstract

The performance of speech recognizers deteriorates when the conditions during training and recognition are mismatched. One source of mismatch is stationary background noise during recognition, also referred to as additive noise. Recognition is also influenced by the frequency response of the whole transmission channel from the speaker to the audio input of the recognizer; the term convolutional noise has been introduced for this type of distortion. Several approaches are known that compensate for these effects individually or jointly (Gales and Young, 1996). This paper describes an approach that compensates for both types of noise. The scheme is based on an estimate of the noise spectrum (Hirsch and Ehrlicher, 1995). In addition, the frequency response is estimated iteratively using the alignment information of the best path in the Viterbi algorithm: the comparison between the spectra of the input signal and the spectra of the corresponding HMM (hidden Markov model) states serves as the basis for the filter estimation. The estimated additive and convolutional noise components are used as input to the well-known Parallel Model Combination (PMC) approach (Gales, 1995) to adapt the whole-word HMMs of a speaker-independent connected-word recognizer. Considerable improvements can be achieved in the presence of just one type of noise as well as in the presence of both types together.
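
The two estimation steps named in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes that all quantities are per-band log power spectra, that only state means are adapted (the "log-add" simplification of model combination, ignoring variances and the cepstral mapping used in full PMC), and that the channel is estimated as the average log ratio between the noise-subtracted input and the aligned state mean. The function names `combine_log_spectrum` and `estimate_channel` and the `floor` parameter are hypothetical.

```python
import math


def combine_log_spectrum(clean_mean, channel, noise_mean):
    """Combine clean-speech, channel, and noise estimates per band.

    All inputs are log power values. In the linear domain the model is
    Y = H * S + N, so the adapted log mean is log(exp(s + h) + exp(n)).
    This is a mean-only approximation (assumption, not the paper's code).
    """
    return [
        math.log(math.exp(s + h) + math.exp(n))
        for s, h, n in zip(clean_mean, channel, noise_mean)
    ]


def estimate_channel(frames, aligned_means, noise_mean, floor=1e-3):
    """Estimate the log frequency response from a Viterbi alignment.

    frames[t] is the observed log spectrum of frame t and aligned_means[t]
    is the clean log-spectral mean of the state aligned to that frame.
    The noise power is subtracted first; `floor` (hypothetical) keeps the
    remaining power positive before taking the logarithm.
    """
    bands = len(noise_mean)
    acc = [0.0] * bands
    for x, s in zip(frames, aligned_means):
        for b in range(bands):
            speech_power = max(math.exp(x[b]) - math.exp(noise_mean[b]), floor)
            acc[b] += math.log(speech_power) - s[b]
    return [a / len(frames) for a in acc]


# Toy two-band example with made-up values.
clean = [math.log(4.0), math.log(9.0)]
true_channel = [math.log(0.5), 0.0]
noise = [0.0, 0.0]  # log(1.0) in both bands

noisy = combine_log_spectrum(clean, true_channel, noise)
estimated = estimate_channel([noisy], [clean], noise)
```

In an iterative scheme of the kind the abstract describes, the estimated channel would be fed back into the model combination and the alignment repeated; that loop is omitted here.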

Keywords

convolution; hidden Markov models; noise; performance evaluation; spectral analysis; speech recognition; Parallel Model Combination method; Viterbi algorithm; additive noise; audio input; convolutional noise; frequency response; hidden Markov model; noise compensation; noise spectrum estimation; performance; speaker independent recognizer; spectra; stationary background noise; training; transmission channel; background noise; filters; frequency estimation; state estimation

fLanguage

English

Publisher

IEEE

Conference_Titel

Automatic Speech Recognition and Understanding, 1997. Proceedings., 1997 IEEE Workshop on

Conference_Location

Santa Barbara, CA

Print_ISBN

0-7803-3698-4

Type

conf

DOI

10.1109/ASRU.1997.659118

Filename

659118