On the application of hidden Markov models for enhancing noisy speech

Author

Ephraim, Yariv ; Malah, David ; Juang, Biing-hwang

Author_Institution

AT&T Bell Lab., Murray Hill, NJ, USA

fYear

1988

fDate

11-14 Apr 1988

Firstpage

533

Abstract

An algorithm is proposed for enhancing noisy speech that has been degraded by statistically independent additive noise. The algorithm is based on modeling the clean speech as a hidden Markov process with mixtures of Gaussian autoregressive (AR) output processes and modeling the noise as a sequence of stationary, statistically independent, Gaussian AR vectors. The parameter sets of the models are estimated using training sequences from the clean speech and the noise process. The parameter set of the hidden Markov model is estimated by the segmental k-means algorithm. Given the estimated models, the noisy speech is enhanced by alternate maximization of the likelihood function of the noisy speech, once over all sequences of states and mixture components assuming that the clean speech signal is given, and then over all vectors of the original speech using the resulting most probable sequence of states and mixture components. This alternating maximization is equivalent to first estimating the most probable sequence of AR models for the speech signal using the Viterbi algorithm, and then using these AR models to construct a sequence of Wiener filters that enhance the noisy speech.
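
The two steps named in the abstract (Viterbi selection of AR models, then Wiener filtering) can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that each Viterbi state stands for one state and mixture pair with its own AR model, that per-frame log-likelihoods are already available, and that the AR convention is A(z) = 1 + a1 z^-1 + ... + ap z^-p. The function names `viterbi`, `ar_power_spectrum`, and `wiener_gain` are hypothetical, and the sketch performs a single pass rather than the alternating iteration described above.

```python
import cmath
import math


def viterbi(log_init, log_trans, log_obs):
    """Return the most probable state sequence.

    log_obs[t][s] is the (assumed given) log-likelihood of frame t under state s.
    """
    n_states = len(log_init)
    score = [log_init[s] + log_obs[0][s] for s in range(n_states)]
    back = []  # back[t-1][s] = best predecessor of state s at frame t
    for t in range(1, len(log_obs)):
        prev = score
        pointers, score = [], []
        for s in range(n_states):
            best = max(range(n_states), key=lambda r: prev[r] + log_trans[r][s])
            pointers.append(best)
            score.append(prev[best] + log_trans[best][s] + log_obs[t][s])
        back.append(pointers)
    # Trace back from the best final state.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


def ar_power_spectrum(ar_coeffs, gain, n_freq):
    """AR power spectrum gain / |A(e^{jw})|^2 on n_freq points in [0, pi]."""
    spectrum = []
    for k in range(n_freq):
        w = math.pi * k / (n_freq - 1)
        a = 1 + sum(c * cmath.exp(-1j * w * (i + 1)) for i, c in enumerate(ar_coeffs))
        spectrum.append(gain / abs(a) ** 2)
    return spectrum


def wiener_gain(speech_spec, noise_spec):
    """Per-frequency Wiener gain S / (S + N) for independent additive noise."""
    return [s / (s + n) for s, n in zip(speech_spec, noise_spec)]
```

In use, the state chosen by `viterbi` for each frame would select the speech AR model passed to `ar_power_spectrum`, and the resulting gain would be applied to the spectrum of the corresponding noisy frame.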

Keywords

Markov processes; filtering and prediction theory; random noise; speech analysis and processing; AR models; Gaussian AR vectors; Gaussian autoregressive output processes; Viterbi algorithm; Wiener filters; additive noise; clean speech signal; hidden Markov models; hidden Markov process; likelihood function; maximization; noisy speech; segmental k-means algorithm; speech analysis; speech processing; training sequences; Degradation; Iterative algorithms; Maximum likelihood estimation; Parameter estimation; Signal processing; Speech enhancement; State estimation

fLanguage

English

Publisher

IEEE

Conference_Titel

Acoustics, Speech, and Signal Processing, 1988. ICASSP-88., 1988 International Conference on

Conference_Location

New York, NY

ISSN

1520-6149

Type

conf

DOI

10.1109/ICASSP.1988.196638

Filename

196638