DocumentCode
312016
Title
Optimal tying of HMM mixture densities using decision trees
Author
Boulianne, Gilles ; Kenny, Patrick
Author_Institution
Spoken Word Technol., Montreal, Que., Canada
Volume
1
fYear
1996
fDate
3-6 Oct 1996
Firstpage
350
Abstract
The most detailed acoustic models in our two-pass speaker-independent, continuous speech recognition system are context-dependent models, which become more difficult to adequately train as the number of different contexts becomes large. Tying of model parameters or clustering of model densities based on bottom-up agglomerative procedures can efficiently reduce the number of parameters to train, but suffer from the additional problem of how to model untrained contexts. Top-down clustering with a decision tree can provide well-trained models for any context, whether seen or unseen in training. Trees are built from a root node that is successively split by selecting, among questions about phonetic context, the one that provides the best segregation of data. Several goodness-of-split criteria have been proposed, such as Poisson-based (Bahl et al., 1991) or single Gaussian-based (Bahl et al., 1994), their choice being primarily motivated by computational considerations. We show, from maximum likelihood considerations, how to derive a computationally efficient criterion based on a different approximation using tied mixtures of Gaussian densities.
Keywords
Gaussian processes; decision theory; hidden Markov models; maximum likelihood estimation; speech recognition; trees (mathematics); Gaussian densities; Gaussian-based method; HMM mixture density tying; Poisson-based method; acoustic models; bottom-up procedures; context-dependent models; continuous speech recognition system; data segregation; decision trees; goodness of split criterion; hidden Markov model; maximum likelihood estimation; model density clustering; model parameters; phonetic context; top-down clustering; training; two-pass speaker-independent recognition; Context modeling; Decision trees; Gaussian processes; Hidden Markov models; Maximum likelihood estimation; Speech recognition
fLanguage
English
Publisher
IEEE
Conference_Titel
Spoken Language, 1996. ICSLP 96. Proceedings., Fourth International Conference on
Conference_Location
Philadelphia, PA
Print_ISBN
0-7803-3555-4
Type
conf
DOI
10.1109/ICSLP.1996.607126
Filename
607126