• DocumentCode
    312016
  • Title
    Optimal tying of HMM mixture densities using decision trees
  • Author
    Boulianne, Gilles ; Kenny, Patrick
  • Author_Institution
    Spoken Word Technol., Montreal, Que., Canada
  • Volume
    1
  • fYear
    1996
  • fDate
    3-6 Oct 1996
  • Firstpage
    350
  • Abstract
    The most detailed acoustic models in our two-pass, speaker-independent continuous speech recognition system are context-dependent models, which become more difficult to train adequately as the number of different contexts becomes large. Tying of model parameters or clustering of model densities based on bottom-up agglomerative procedures can efficiently reduce the number of parameters to train, but these approaches suffer from the additional problem of how to model untrained contexts. Top-down clustering with a decision tree can provide well-trained models for any context, whether seen or unseen in training. Trees are built from a root node that is successively split by selecting, among questions about phonetic context, the one that provides the best segregation of the data. Several goodness-of-split criteria have been proposed, such as Poisson-based (Bahl et al., 1991) or single-Gaussian-based (Bahl et al., 1994) criteria, their choice being primarily motivated by computational considerations. We show, from maximum likelihood considerations, how to derive a computationally efficient criterion based on a different approximation using tied mixtures of Gaussian densities.
  • Keywords
    Gaussian processes; decision theory; hidden Markov models; maximum likelihood estimation; speech recognition; trees (mathematics); Gaussian densities; Gaussian-based method; HMM mixture density tying; Poisson-based method; acoustic models; bottom-up procedures; context-dependent models; continuous speech recognition system; data segregation; decision trees; goodness of split criterion; hidden Markov model; maximum likelihood estimation; model density clustering; model parameters; phonetic context; top-down clustering; training; two-pass speaker-independent recognition; Context modeling; Decision trees; Gaussian processes; Hidden Markov models; Maximum likelihood estimation; Speech recognition;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Spoken Language, 1996. ICSLP 96. Proceedings., Fourth International Conference on
  • Conference_Location
    Philadelphia, PA
  • Print_ISBN
    0-7803-3555-4
  • Type
    conf
  • DOI
    10.1109/ICSLP.1996.607126
  • Filename
    607126
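
The tree-growing step described in the abstract (choosing, among questions about phonetic context, the one that best segregates the data) can be illustrated with a minimal sketch. It uses the simpler single-Gaussian likelihood gain as the goodness-of-split criterion, not the tied-mixture criterion the paper derives, whose formula is not given in this record. The function names, the `(left, right)` context encoding, the question set, and the variance floor are all hypothetical choices made for illustration.

```python
import math

VAR_FLOOR = 1e-4  # assumed floor to keep the log-likelihood finite


def gaussian_log_likelihood(frames):
    """Log-likelihood of frames under one diagonal Gaussian fitted to them."""
    n = len(frames)
    if n == 0:
        return 0.0
    total = 0.0
    for d in range(len(frames[0])):
        mean = sum(f[d] for f in frames) / n
        sq_dev = sum((f[d] - mean) ** 2 for f in frames)
        var = max(sq_dev / n, VAR_FLOOR)
        total += n * math.log(2.0 * math.pi * var) + sq_dev / var
    return -0.5 * total


def best_question(samples, questions):
    """Pick the question whose yes/no split gives the largest likelihood gain.

    samples:   list of (context, frame) pairs
    questions: dict mapping a question name to a predicate on the context
    Returns (name, gain), or (None, 0.0) if no question splits the data.
    """
    parent = gaussian_log_likelihood([frame for _, frame in samples])
    best_name, best_gain = None, 0.0
    for name, ask in questions.items():
        yes = [frame for ctx, frame in samples if ask(ctx)]
        no = [frame for ctx, frame in samples if not ask(ctx)]
        if not yes or not no:
            continue  # this question does not actually split the node
        gain = gaussian_log_likelihood(yes) + gaussian_log_likelihood(no) - parent
        if gain > best_gain:
            best_name, best_gain = name, gain
    return best_name, best_gain
```

Applying `best_question` recursively to the "yes" and "no" subsets, until the gain or the sample count falls below a threshold, grows the tree top-down. An unseen context still answers every question, so it always reaches some leaf, which is why this approach covers contexts absent from training.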