DocumentCode
1135702
Title
Phonemic hidden Markov models with continuous mixture output densities for large vocabulary word recognition
Author
Deng, L. ; Kenny, P. ; Lennig, M. ; Gupta, V. ; Seitz, F. ; Mermelstein, P.
Author_Institution
INRS-Telecommun., Montreal, Que., Canada
Volume
39
Issue
7
fYear
1991
fDate
7/1/1991 12:00:00 AM
Firstpage
1677
Lastpage
1681
Abstract
The authors demonstrate the effectiveness of phonemic hidden Markov models with Gaussian mixture output densities (mixture HMMs) for speaker-dependent large-vocabulary word recognition. Speech recognition experiments show that for almost any reasonable amount of training data, recognizers using mixture HMMs consistently outperform those employing unimodal Gaussian HMMs. With a sufficiently large training set (e.g., more than 2500 words), use of HMMs with 25-component mixture distributions typically reduces recognition errors by about 40%. It is also found that the mixture HMMs outperform a set of unimodal generalized triphone models having the same number of parameters. Previous attempts to employ mixture HMMs for speech recognition proved discouraging because of the high complexity and computational cost of implementing the Baum-Welch training algorithm. It is shown how mixture HMMs can be implemented very simply in unimodal transition-based frameworks by allowing multiple transitions from one state to another.
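The last sentence of the abstract can be illustrated with a small sketch. It assumes scalar observations and output densities attached to transitions, and it shows only the algebraic identity behind the idea: a transition with probability a and a K-component Gaussian mixture output gives the same likelihood as K parallel transitions between the same two states, each with probability a * c_k and a single Gaussian. All function names and numbers are hypothetical; this is not the authors' implementation.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a one-dimensional Gaussian (scalar observation assumed)."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def mixture_transition_likelihood(x, trans_prob, weights, means, variances):
    """Likelihood of one transition with a mixture output: a * sum_k c_k N(x; mu_k, var_k)."""
    mixture = sum(
        c * gaussian_pdf(x, m, v) for c, m, v in zip(weights, means, variances)
    )
    return trans_prob * mixture


def expand_to_parallel_transitions(trans_prob, weights, means, variances):
    """Replace one mixture transition by K unimodal transitions.

    Each returned tuple is (probability a * c_k, mean mu_k, variance var_k).
    Hypothetical helper for illustration only.
    """
    return [(trans_prob * c, m, v) for c, m, v in zip(weights, means, variances)]


def parallel_transition_likelihood(x, transitions):
    """Sum the likelihood over the parallel unimodal transitions."""
    return sum(p * gaussian_pdf(x, m, v) for p, m, v in transitions)


# Illustrative values, not taken from the paper.
a = 0.6
c = [0.5, 0.3, 0.2]
mu = [-1.0, 0.0, 2.0]
var = [1.0, 0.5, 2.0]
x = 0.4

parallel = expand_to_parallel_transitions(a, c, mu, var)
lhs = mixture_transition_likelihood(x, a, c, mu, var)
rhs = parallel_transition_likelihood(x, parallel)
```

Because the two likelihoods agree for every observation, a trainer written for unimodal transition-based models could in principle estimate the mixture parameters without a separate mixture-specific update, which is the simplification the abstract describes.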
Keywords
Markov processes; speech recognition; Baum-Welch training algorithm; continuous mixture output densities; large vocabulary word recognition; mixture HMM; multiple transitions; phonemic hidden Markov models; recognition errors; speaker dependent recognition; training set; unimodal Gaussian HMM; unimodal generalized triphone models; unimodal transition-based frameworks; Adaptive signal processing; Hidden Markov models; Least squares approximation; Noise cancellation; Signal processing algorithms; Speech processing; Speech recognition; Transfer functions; Vocabulary; Zinc
fLanguage
English
Journal_Title
Signal Processing, IEEE Transactions on
Publisher
IEEE
ISSN
1053-587X
Type
jour
DOI
10.1109/78.134406
Filename
134406