DocumentCode :
1445211
Title :
Boosted Mixture Learning of Gaussian Mixture Hidden Markov Models Based on Maximum Likelihood for Speech Recognition
Author :
Du, Jun ; Hu, Yu ; Jiang, Hui
Author_Institution :
iFlytek Res., Hefei, China
Volume :
19
Issue :
7
fYear :
2011
Firstpage :
2091
Lastpage :
2100
Abstract :
In this paper, we apply the well-known boosted mixture learning (BML) method to learn Gaussian mixture HMMs in speech recognition. BML is an incremental method for learning mixture models for classification problems. In each BML step, one new mixture component is estimated according to the functional gradient of an objective function, so that the component is added along the direction that maximizes the objective function. Several techniques are proposed to extend BML from simple mixture models such as the Gaussian mixture model (GMM) to the Gaussian mixture hidden Markov model (HMM): a Viterbi approximation for state segmentation; weight decay and sampling boosting to initialize sample weights and avoid overfitting; a combination of partial and global updating to refine model parameters in each BML iteration; and the Bayesian information criterion (BIC) for parsimonious modeling. Experimental results on two large-vocabulary continuous speech recognition tasks, WSJ-5k and Switchboard, show that the proposed BML yields significant performance gains over the conventional training procedure, especially for small model sizes.
Keywords :
Gaussian processes; hidden Markov models; learning (artificial intelligence); maximum likelihood estimation; speech recognition; BIC; BML iteration; BML method; Bayesian information criterion; GMM; Gaussian mixture HMM; Gaussian mixture hidden Markov models; Gaussian mixture model; Viterbi approximation; WSJ-5k; boosted mixture learning method; classification problems; large-vocabulary continuous speech recognition; maximum likelihood; sampling boosting; speech recognition; state segmentation; Switchboard tasks; weight decay; Boosting; Hidden Markov models; Maximum likelihood estimation; Speech recognition; Training; Viterbi algorithm; Boosted mixture learning (BML); boosting; functional gradient; speech recognition
fLanguage :
English
Journal_Title :
Audio, Speech, and Language Processing, IEEE Transactions on
Publisher :
IEEE
ISSN :
1558-7916
Type :
jour
DOI :
10.1109/TASL.2011.2112352
Filename :
5710403