Title :
Optimal tying of HMM mixture densities using decision trees
Author :
Boulianne, Gilles ; Kenny, Patrick
Author_Institution :
Spoken Word Technol., Montreal, Que., Canada
Abstract :
The most detailed acoustic models in our two-pass, speaker-independent, continuous speech recognition system are context-dependent models, which become more difficult to train adequately as the number of different contexts grows. Tying model parameters or clustering model densities with bottom-up agglomerative procedures can efficiently reduce the number of parameters to train, but these approaches leave open the problem of how to model untrained contexts. Top-down clustering with a decision tree can provide well-trained models for any context, whether seen or unseen in training. Trees are built from a root node that is successively split by selecting, among questions about phonetic context, the one that provides the best segregation of the data. Several goodness-of-split criteria have been proposed, such as Poisson-based (Bahl et al., 1991) or single Gaussian-based (Bahl et al., 1994), with the choice among them motivated primarily by computational considerations. We show, from maximum likelihood considerations, how to derive a computationally efficient criterion based on a different approximation using tied mixtures of Gaussian densities.
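The node-splitting step described in the abstract can be illustrated with a minimal sketch. It uses the simpler single-Gaussian likelihood gain as the goodness-of-split measure, not the tied-mixture criterion the paper derives, whose details the abstract does not give. The function names, the variance floor, the toy contexts, and the two phonetic questions are all assumptions made for illustration.

```python
import math

def gaussian_loglik(frames, var_floor=1e-6):
    """Log-likelihood of frames under a diagonal Gaussian fitted to them by
    maximum likelihood. The variance floor (an assumption) avoids log(0)."""
    n = len(frames)
    if n == 0:
        return 0.0
    total = 0.0
    for k in range(len(frames[0])):
        mean = sum(f[k] for f in frames) / n
        var = max(sum((f[k] - mean) ** 2 for f in frames) / n, var_floor)
        total += math.log(2 * math.pi * var) + 1.0
    return -0.5 * n * total

def best_split(samples, questions):
    """Pick the question with the largest likelihood gain.
    samples: list of (context, frame); questions: name -> predicate on context."""
    parent = gaussian_loglik([f for _, f in samples])
    best_name, best_gain = None, 0.0
    for name, pred in questions.items():
        yes = [f for c, f in samples if pred(c)]
        no = [f for c, f in samples if not pred(c)]
        if not yes or not no:
            continue  # a question that separates nothing is not a split
        gain = gaussian_loglik(yes) + gaussian_loglik(no) - parent
        if gain > best_gain:
            best_name, best_gain = name, gain
    return best_name, best_gain

# Hypothetical toy data: context = (left phone, right phone), 1-D frames.
samples = [
    (("m", "a"), (0.0,)), (("n", "i"), (0.2,)), (("m", "t"), (0.1,)),
    (("s", "a"), (5.0,)), (("s", "t"), (5.1,)), (("f", "i"), (4.9,)),
]
questions = {
    "left_is_nasal": lambda c: c[0] in {"m", "n"},
    "right_is_vowel": lambda c: c[1] in {"a", "i"},
}
name, gain = best_split(samples, questions)
```

Because every context answers each question with yes or no, a context never seen in training still reaches some leaf of the tree, which is the property the abstract attributes to top-down clustering.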
Keywords :
Gaussian processes; decision theory; hidden Markov models; maximum likelihood estimation; speech recognition; trees (mathematics); Gaussian densities; Gaussian-based method; HMM mixture density tying; Poisson-based method; acoustic models; bottom-up procedures; context-dependent models; continuous speech recognition system; data segregation; decision trees; goodness of split criterion; model density clustering; model parameters; phonetic context; top-down clustering; training; two-pass speaker-independent recognition; context modeling
Conference_Title :
Spoken Language, 1996. ICSLP 96. Proceedings., Fourth International Conference on
Conference_Location :
Philadelphia, PA
Print_ISBN :
0-7803-3555-4
DOI :
10.1109/ICSLP.1996.607126