• DocumentCode
    1445211
  • Title
    Boosted Mixture Learning of Gaussian Mixture Hidden Markov Models Based on Maximum Likelihood for Speech Recognition
  • Author
    Du, Jun; Hu, Yu; Jiang, Hui
  • Author_Institution
    iFlytek Res., Hefei, China
  • Volume
    19
  • Issue
    7
  • fYear
    2011
  • Firstpage
    2091
  • Lastpage
    2100
  • Abstract
    In this paper, we apply the well-known boosted mixture learning (BML) method to learn Gaussian mixture HMMs in speech recognition. BML is an incremental method for learning mixture models for classification problems. In each step of BML, one new mixture component is estimated according to the functional gradient of an objective function, so that it is added along the direction that maximizes the objective function. Several techniques are proposed to extend BML from simple mixture models like the Gaussian mixture model (GMM) to the Gaussian mixture hidden Markov model (HMM): Viterbi approximation for state segmentation; weight decay and sampling boosting to initialize sample weights so as to avoid overfitting; a combination of partial updating and global updating to refine model parameters in each BML iteration; and use of the Bayesian Information Criterion (BIC) for parsimonious modeling. Experimental results on two large-vocabulary continuous speech recognition tasks, namely the WSJ-5k and Switchboard tasks, show that the proposed BML yields a significant performance gain over the conventional training procedure, especially for small model sizes.
  • Keywords
    Gaussian processes; hidden Markov models; learning (artificial intelligence); maximum likelihood estimation; speech recognition; BIC; BML iteration; BML method; Bayesian information criterion; GMM; Gaussian mixture HMM; Gaussian mixture hidden Markov models; Gaussian mixture model; Viterbi approximation; WSJ-5k; boosted mixture learning method; classification problems; large-vocabulary continuous speech recognition; maximum likelihood; sampling boosting; state segmentation; Switchboard tasks; weight decay; Boosting; Training; Viterbi algorithm; Boosted mixture learning (BML); functional gradient
  • fLanguage
    English
  • Journal_Title
    Audio, Speech, and Language Processing, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1558-7916
  • Type
    jour
  • DOI
    10.1109/TASL.2011.2112352
  • Filename
    5710403
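
The incremental step described in the abstract (add one component along the functional gradient of the objective, then check BIC) can be illustrated on a plain 1-D GMM. The sketch below is not the authors' implementation: it covers only the GMM case under a maximum likelihood objective, and it omits the HMM-specific parts (Viterbi state segmentation, weight decay, sampling boosting, and the partial/global parameter updates). The function names, the grid line search over the mixing weight, the relative variance floor, and the fixed iteration count are all assumptions made for illustration.

```python
import numpy as np

TINY = 1e-100  # guards against log(0) and division by zero


def gauss_pdf(x, mean, var):
    """Density of a 1-D Gaussian evaluated at every point of x."""
    return np.exp(-0.5 * (x - mean) ** 2 / var) / np.sqrt(2.0 * np.pi * var)


def bic(log_lik, n_components, n_samples):
    """BIC of a 1-D GMM with K components, which has 3K - 1 free parameters."""
    n_params = 3 * n_components - 1
    return -2.0 * log_lik + n_params * np.log(n_samples)


def fit_new_component(x, grad, var_init, var_floor, n_iter=10):
    """Approximately maximize sum_i grad_i * f(x_i) over Gaussians f.

    Uses EM-like fixed-point updates. The variance floor and the small,
    fixed iteration count are assumed safeguards against collapsing
    onto a single sample.
    """
    mean, var = x[np.argmax(grad)], var_init
    for _ in range(n_iter):
        resp = grad * gauss_pdf(x, mean, var)
        resp = resp / max(resp.sum(), TINY)
        mean = np.sum(resp * x)
        var = max(np.sum(resp * (x - mean) ** 2), var_floor)
    return mean, var


def boosted_gmm_1d(x, max_components=8, rel_var_floor=0.1):
    """Grow a 1-D GMM one component at a time; stop when BIC stops improving."""
    x = np.asarray(x, dtype=float)
    n = x.size
    total_var = max(x.var(), 1e-6)
    var_floor = rel_var_floor * total_var

    # Start from a single Gaussian fitted to all the data.
    weights, means, variances = [1.0], [x.mean()], [total_var]
    density = np.maximum(gauss_pdf(x, means[0], variances[0]), TINY)
    best_bic = bic(np.log(density).sum(), 1, n)
    alphas = np.linspace(0.05, 0.95, 19)  # candidate mixing weights

    while len(weights) < max_components:
        # Functional gradient of sum_i log F(x_i) with respect to F(x_i).
        grad = 1.0 / density
        grad = grad / grad.sum()
        mean, var = fit_new_component(x, grad, total_var, var_floor)
        new_pdf = gauss_pdf(x, mean, var)

        # Line search: F_new = (1 - a) * F_old + a * f_new.
        log_liks = [np.log((1.0 - a) * density + a * new_pdf).sum()
                    for a in alphas]
        best = int(np.argmax(log_liks))
        cand_bic = bic(log_liks[best], len(weights) + 1, n)
        if cand_bic >= best_bic:
            break  # the extra component does not pay for its parameters

        a = alphas[best]
        weights = [w * (1.0 - a) for w in weights] + [a]
        means.append(mean)
        variances.append(var)
        density = (1.0 - a) * density + a * new_pdf
        best_bic = cand_bic

    return np.array(weights), np.array(means), np.array(variances)
```

Calling `boosted_gmm_1d(samples)` on a 1-D array returns the mixture weights, means, and variances. Earlier components are never re-estimated here, so a full system would follow each addition with a refinement pass over all parameters, as the abstract's partial and global updating suggests.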