• DocumentCode
    2979192
  • Title
    ACID/HNN: a framework for hierarchical connectionist acoustic modeling
  • Author
    Fritsch, Jürgen
  • Author_Institution
    Karlsruhe Univ., Germany
  • fYear
    1997
  • fDate
    14-17 Dec 1997
  • Firstpage
    164
  • Lastpage
    171
  • Abstract
    We propose the ACID/HNN framework for context-dependent large vocabulary conversational speech recognition (LVCSR) using connectionist acoustic models. Our approach advocates the principles of modularity and hierarchy for the estimation of thousands of context-dependent posterior HMM state probabilities. We argue that a hierarchical organization of the acoustic model is crucial in obtaining competitive performance with connectionist estimators. We introduce ACID, an Agglomerative Clustering scheme based on Information Divergence, and use it to induce soft decision trees for hierarchical classification. A Hierarchy of Neural Networks (HNN) is then applied to the estimation of conditional posterior probabilities. We discuss the benefits of hierarchically structured acoustic models for speaker adaptation and scoring speed-up. Finally, we present experiments on the Switchboard conversational telephone speech corpus, currently a major focus of research in the LVCSR community.
  • Keywords
    hidden Markov models; natural languages; probability; speech recognition; trees (mathematics); ACID/HNN framework; Agglomerative Clustering scheme; Hierarchy of Neural Networks; LVCSR; Switchboard conversational telephone speech corpus; competitive performance; conditional posterior probabilities; connectionist acoustic models; connectionist estimators; context dependent large vocabulary conversational speech recognition; context dependent posterior HMM state probability estimation; hierarchical classification; hierarchical connectionist acoustic modeling; hierarchical organization; hierarchically structured acoustic models; information divergence; neural network hierarchy; soft decision trees; speaker adaptation; Adaptation model; Classification tree analysis; Context modeling; Decision trees; Hidden Markov models; Loudspeakers; Neural networks; Speech recognition; State estimation; Vocabulary;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Proceedings of the 1997 IEEE Workshop on Automatic Speech Recognition and Understanding
  • Conference_Location
    Santa Barbara, CA
  • Print_ISBN
    0-7803-3698-4
  • Type
    conf
  • DOI
    10.1109/ASRU.1997.659001
  • Filename
    659001
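
The abstract describes two steps: agglomerative clustering driven by an information divergence, and a tree of estimators whose conditional posteriors combine into a leaf (HMM state) posterior. The sketch below illustrates both ideas under loud assumptions. It treats each class as a discrete distribution, uses the symmetric Kullback-Leibler divergence as the distance, and merges clusters by a weighted average. None of these choices is confirmed by this record, and the function names (`sym_kl`, `merge`, `agglomerate`, `leaf_posterior`) are hypothetical. The paper's actual distance measure, merging rule, and network design may differ.

```python
import math


def sym_kl(p, q, eps=1e-12):
    """Symmetric KL divergence, KL(p||q) + KL(q||p), for discrete distributions.

    Assumption: classes are summarized by discrete distributions. `eps`
    only guards against log(0).
    """
    return sum((pi - qi) * math.log((pi + eps) / (qi + eps))
               for pi, qi in zip(p, q))


def merge(p, q, wp, wq):
    """Weighted average of two distributions (assumed merging rule)."""
    total = wp + wq
    return [(wp * pi + wq * qi) / total for pi, qi in zip(p, q)]


def agglomerate(dists, weights):
    """Bottom-up clustering: repeatedly merge the pair with the smallest divergence.

    Returns a binary tree as nested tuples of leaf indices.
    """
    nodes = [(i, list(d), w) for i, (d, w) in enumerate(zip(dists, weights))]
    while len(nodes) > 1:
        best = None
        for a in range(len(nodes)):
            for b in range(a + 1, len(nodes)):
                d = sym_kl(nodes[a][1], nodes[b][1])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        tree_a, dist_a, w_a = nodes[a]
        tree_b, dist_b, w_b = nodes[b]
        merged = ((tree_a, tree_b), merge(dist_a, dist_b, w_a, w_b), w_a + w_b)
        nodes = [n for k, n in enumerate(nodes) if k not in (a, b)] + [merged]
    return nodes[0][0]


def leaf_posterior(path_posteriors):
    """Posterior of a leaf as the product of the conditional posteriors
    assigned to the branches on the path from the root to that leaf."""
    return math.prod(path_posteriors)


if __name__ == "__main__":
    # Toy data (made up for illustration): four classes over two symbols.
    dists = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.3, 0.7]]
    print(agglomerate(dists, [1, 1, 1, 1]))
    print(leaf_posterior([0.5, 0.4]))
```

In this toy setup the two classes concentrated on the first symbol are merged first, then the two concentrated on the second, and the root joins the two groups. In a hierarchy of this shape, each internal node only has to discriminate among its own children, which is the modularity the abstract argues for.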