Regional Information Center for Science and Technology - Boosted Mixture Learning of Gaussian Mixture Hidden Markov Models Based on Maximum Likelihood for Speech Recognition

DocumentCode :

1445211

Title :

Boosted Mixture Learning of Gaussian Mixture Hidden Markov Models Based on Maximum Likelihood for Speech Recognition

Author :

Du, Jun ; Hu, Yu ; Jiang, Hui

Author_Institution :

iFlytek Res., Hefei, China

Volume :

Issue :

fYear :

2011

Firstpage :

2091

Lastpage :

2100

Abstract :

In this paper, we apply the well-known boosted mixture learning (BML) method to learn Gaussian mixture HMMs in speech recognition. BML is an incremental method to learn mixture models for classification problems. In each step of BML, one new mixture component is estimated according to the functional gradient of an objective function to ensure that it is added along the direction that maximizes the objective function. Several techniques have been proposed to extend BML from simple mixture models like the Gaussian mixture model (GMM) to the Gaussian mixture hidden Markov model (HMM), including Viterbi approximation for state segmentation, weight decay and sampling boosting to initialize sample weights to avoid overfitting, a combination of partial and global updating to refine model parameters in each BML iteration, and use of the Bayesian Information Criterion (BIC) for parsimonious modeling. Experimental results on two large-vocabulary continuous speech recognition tasks, namely the WSJ-5k and Switchboard tasks, have shown that the proposed BML yields significant performance gain over the conventional training procedure, especially for small model sizes.
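
The incremental step described in the abstract can be illustrated with a minimal sketch for a plain one-dimensional GMM under a maximum-likelihood objective. This is not the authors' procedure: all function names are hypothetical, the grid line search over the mixing weight and the fixed-point refinement of the new component are assumptions chosen for brevity, and the HMM-specific techniques (Viterbi state segmentation, weight decay, sampling boosting, partial and global updating) are left out. The only ideas taken from the abstract are that each step adds one component guided by the functional gradient of the objective, and that BIC decides when to stop growing the model.

```python
import math

def gauss_pdf(x, mean, var):
    """Density of a 1-D Gaussian."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def mixture_pdf(x, comps):
    """comps is a list of (weight, mean, var) tuples."""
    return sum(w * gauss_pdf(x, m, v) for w, m, v in comps)

def log_likelihood(data, comps):
    return sum(math.log(mixture_pdf(x, comps) + 1e-300) for x in data)

def bic(data, comps):
    # Free parameters of a K-component 1-D GMM: K means, K variances, K-1 weights.
    n_params = 3 * len(comps) - 1
    return -2.0 * log_likelihood(data, comps) + n_params * math.log(len(data))

def weighted_moments(data, weights, var_floor):
    """Weighted mean and (floored) variance of the data."""
    total = sum(weights)
    mean = sum(w * x for w, x in zip(weights, data)) / total
    var = sum(w * (x - mean) ** 2 for w, x in zip(weights, data)) / total
    return mean, max(var, var_floor)

def propose_component(data, comps, n_iters=20, var_floor=1e-3):
    """Pick a Gaussian g that approximately maximizes sum_i g(x_i) / F(x_i)."""
    # The derivative of sum_i log F(x_i) with respect to F(x_i) is 1 / F(x_i),
    # so samples that the current mixture F models poorly get larger weights.
    grad = [1.0 / (mixture_pdf(x, comps) + 1e-300) for x in data]
    top = max(grad)
    grad = [g / top for g in grad]  # rescale for numerical safety
    mean, var = weighted_moments(data, grad, var_floor)
    for _ in range(n_iters):
        # Fixed-point step on the stationarity conditions of sum_i grad_i * g(x_i)
        # (an assumed refinement scheme, not taken from the paper).
        resp = [g * gauss_pdf(x, mean, var) for g, x in zip(grad, data)]
        if sum(resp) <= 0.0:
            break
        mean, var = weighted_moments(data, resp, var_floor)
    return mean, var

def boosted_fit(data, max_components=4, var_floor=1e-3):
    """Grow a 1-D GMM one component at a time; stop when BIC stops improving."""
    mean, var = weighted_moments(data, [1.0] * len(data), var_floor)
    comps = [(1.0, mean, var)]
    while len(comps) < max_components:
        new_mean, new_var = propose_component(data, comps, var_floor=var_floor)
        # Grid line search over the mixing weight alpha of the new component:
        # F_new = (1 - alpha) * F_old + alpha * g.
        candidates = []
        for k in range(1, 20):
            alpha = k / 20.0
            cand = [(w * (1.0 - alpha), m, v) for w, m, v in comps]
            cand.append((alpha, new_mean, new_var))
            candidates.append(cand)
        best = max(candidates, key=lambda c: log_likelihood(data, c))
        if bic(data, best) >= bic(data, comps):
            break  # the extra component does not pay for its parameters
        comps = best
    return comps
```

Calling `boosted_fit(samples, max_components=4)` on a list of floats returns the `(weight, mean, var)` tuples of the grown mixture. Because a component is accepted only when BIC decreases, the final log-likelihood is never lower than that of the initial single Gaussian.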

Keywords :

Gaussian processes; hidden Markov models; learning (artificial intelligence); maximum likelihood estimation; speech recognition; BIC; BML iteration; BML method; Bayesian information criterion; GMM; Gaussian mixture HMM; Gaussian mixture hidden Markov models; Gaussian mixture model; Viterbi approximation; WSJ-5k; boosted mixture learning method; classification problems; large-vocabulary continuous speech recognition; maximum likelihood; sampling boosting; state segmentation; Switchboard tasks; weight decay; boosting; training; Viterbi algorithm; boosted mixture learning (BML); functional gradient

fLanguage :

English

Journal_Title :

Audio, Speech, and Language Processing, IEEE Transactions on

Publisher :

IEEE

ISSN :

1558-7916

Type :

jour

DOI :

10.1109/TASL.2011.2112352

Filename :

5710403

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=1445211