On the application of hidden Markov models for enhancing noisy speech

Author

Ephraim, Yariv ; Malah, David ; Juang, Bing-Hwang

Author_Institution

AT&T Bell Lab., Murray Hill, NJ, USA

Volume

37

Issue

12

fYear

1989

fDate

12/1/1989

Firstpage

1846

Lastpage

1856

Abstract

A maximum-a-posteriori approach is proposed for enhancing speech signals that have been degraded by statistically independent additive noise. The approach is based on statistical modeling of the clean speech signal and the noise process, using long training sequences from the two processes. Hidden Markov models (HMMs) with mixtures of Gaussian autoregressive (AR) output probability distributions (PDs) are used to model the clean speech signal. The model for the noise process depends on its nature. The parameter set of the HMM is estimated using the Baum or the EM (estimation-maximization) algorithm. The noisy speech is enhanced by reestimating the clean speech waveform using the EM algorithm. Efficient approximations of the training and enhancement procedures are examined. For training, this results in the segmental k-means approach for hidden Markov modeling, in which the state sequence and the parameter set of the model are alternately estimated. Similarly, the enhancement is done by alternate estimation of the state and observation sequences. An improvement of approximately 4.0-6.0 dB in signal-to-noise ratio (SNR) is achieved at 10-dB input SNR.
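
The alternating estimation that the abstract calls the segmental k-means approach can be illustrated with a minimal sketch. It is not the authors' implementation: for brevity it assumes one scalar Gaussian output density per state instead of mixtures of Gaussian AR densities, a uniform initial-state distribution, and a small variance floor. The function names (`viterbi`, `reestimate`, `segmental_kmeans`) and the parameters `var_floor` and `eps` are hypothetical.

```python
import math


def viterbi(obs, log_pi, log_A, means, variances):
    """Most likely state sequence under scalar Gaussian outputs (assumed model)."""
    n_states = len(means)

    def log_b(s, x):
        # Log density of a scalar Gaussian for state s.
        return -0.5 * (math.log(2 * math.pi * variances[s])
                       + (x - means[s]) ** 2 / variances[s])

    delta = [log_pi[s] + log_b(s, obs[0]) for s in range(n_states)]
    back = []
    for x in obs[1:]:
        prev, ptr, delta = delta, [], []
        for s in range(n_states):
            best = max(range(n_states), key=lambda r: prev[r] + log_A[r][s])
            ptr.append(best)
            delta.append(prev[best] + log_A[best][s] + log_b(s, x))
        back.append(ptr)
    # Backtrack from the best final state.
    state = max(range(n_states), key=lambda s: delta[s])
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path


def reestimate(obs, path, means, variances, var_floor=1e-3, eps=1e-6):
    """Update parameters from the frames assigned to each state."""
    n_states = len(means)
    new_means, new_vars = list(means), list(variances)
    for s in range(n_states):
        xs = [x for x, q in zip(obs, path) if q == s]
        if xs:  # an empty state keeps its previous parameters (assumption)
            m = sum(xs) / len(xs)
            v = sum((x - m) ** 2 for x in xs) / len(xs)
            new_means[s], new_vars[s] = m, max(v, var_floor)
    # Transition probabilities from counts along the decoded path.
    counts = [[eps] * n_states for _ in range(n_states)]
    for a, b in zip(path, path[1:]):
        counts[a][b] += 1.0
    log_A = [[math.log(c / sum(row)) for c in row] for row in counts]
    return new_means, new_vars, log_A


def segmental_kmeans(obs, means, variances, n_iter=10):
    """Alternate state-sequence decoding and parameter re-estimation."""
    n = len(means)
    log_pi = [math.log(1.0 / n)] * n
    log_A = [[math.log(1.0 / n)] * n for _ in range(n)]
    path = None
    for _ in range(n_iter):
        new_path = viterbi(obs, log_pi, log_A, means, variances)
        if new_path == path:  # segmentation unchanged: stop
            break
        path = new_path
        means, variances, log_A = reestimate(obs, path, means, variances)
    return path, means, variances
```

A call such as `segmental_kmeans(samples, [0.0, 4.0], [1.0, 1.0])` returns the decoded state sequence together with the updated means and variances. The enhancement procedure described in the abstract alternates in a similar way, but between the state sequence and the clean observation sequence; that step is not shown here.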

Keywords

Markov processes; speech analysis and processing; Gaussian autoregressive; additive noise; estimation-maximization; hidden Markov models; maximum-a-posteriori approach; probability distributions; segmental k-means approach; speech signals; statistical modeling; training sequences; Additive noise; Degradation; Distortion measurement; Hidden Markov models; Signal processing; Signal to noise ratio; Speech analysis; Speech enhancement; Speech processing; State estimation;

fLanguage

English

Journal_Title

Acoustics, Speech and Signal Processing, IEEE Transactions on

Publisher

IEEE

ISSN

0096-3518

Type

jour

DOI

10.1109/29.45532

Filename

45532