• DocumentCode
    779877
  • Title

    A Study of Variable-Parameter Gaussian Mixture Hidden Markov Modeling for Noisy Speech Recognition

  • Author

    Cui, Xiaodong ; Gong, Yifan

  • Author_Institution
    Dept. of Electr. Eng., California Univ., Los Angeles, CA
  • Volume
    15
  • Issue
    4
  • fYear
    2007
  • fDate
    5/1/2007
  • Firstpage
    1366
  • Lastpage
    1376
  • Abstract
    To improve recognition performance in noisy environments, multicondition training is usually applied, in which speech signals corrupted by a variety of noises are used in acoustic model training. Published hidden Markov modeling of speech uses multiple Gaussian distributions to cover the spread of the speech distribution caused by noise, which distracts from modeling the speech event itself and possibly sacrifices performance on clean speech. In this paper, we propose a novel approach which extends the conventional Gaussian mixture hidden Markov model (GMHMM) by modeling state emission parameters (mean and variance) as a polynomial function of a continuous environment-dependent variable. At recognition time, a set of HMMs specific to the given value of the environment variable is instantiated and used for recognition. The maximum-likelihood (ML) estimation of the polynomial functions of the proposed variable-parameter GMHMM is given within the expectation-maximization (EM) framework. Experiments on the Aurora 2 database show significant improvements of the variable-parameter Gaussian mixture HMMs compared to the conventional GMHMMs.
  • Keywords
    Gaussian processes; expectation-maximisation algorithm; hidden Markov models; polynomials; speech recognition; acoustic model training; expectation-maximization framework; maximum-likelihood estimation; noisy speech recognition; polynomial functions; speech distribution; speech signals; state emission parameters; variable-parameter Gaussian mixture hidden Markov modeling; Acoustic noise; Databases; Gaussian distribution; Gaussian noise; Hidden Markov models; Maximum likelihood estimation; Polynomials; Speech enhancement; Speech recognition; Working environment noise; Hidden Markov model; noise robust speech recognition; polynomial regression; variable parameter;
  • fLanguage
    English
  • Journal_Title
    Audio, Speech, and Language Processing, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1558-7916
  • Type

    jour

  • DOI
    10.1109/TASL.2006.889791
  • Filename
    4156190
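
The abstract's central idea, Gaussian emission parameters that vary as a polynomial function of a continuous environment variable and are instantiated at recognition time, can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the function names, the example coefficients, the choice of SNR in dB as the environment variable, and the use of a polynomial in the log-variance (chosen here only to keep the variance positive) are all assumptions.

```python
import math

def poly_eval(coeffs, v):
    """Evaluate c0 + c1*v + c2*v**2 + ... with Horner's rule."""
    result = 0.0
    for c in reversed(coeffs):
        result = result * v + c
    return result

def instantiate_gaussian(mean_coeffs, logvar_coeffs, v):
    """Return (mean, variance) of one scalar Gaussian at environment value v.

    Assumption: the variance is modeled through a polynomial in its
    logarithm so that it stays positive; the paper may parameterize
    it differently.
    """
    mean = poly_eval(mean_coeffs, v)
    var = math.exp(poly_eval(logvar_coeffs, v))
    return mean, var

def log_likelihood(x, mean, var):
    """Log density of a scalar Gaussian with the given mean and variance."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)

# Hypothetical coefficients for one Gaussian component (illustrative only).
mean_coeffs = [0.0, 1.0]   # mean(v) = v
logvar_coeffs = [0.0]      # variance(v) = exp(0) = 1

# Assumed environment variable: an SNR estimate of 5 dB for the utterance.
mean, var = instantiate_gaussian(mean_coeffs, logvar_coeffs, 5.0)
score = log_likelihood(5.0, mean, var)
```

In a full recognizer, the same instantiation would be repeated for every mixture component of every state once the environment value is known, and the resulting conventional GMHMMs would then be used for decoding.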