• DocumentCode
    900297
  • Title
    Maximum entropy direct models for speech recognition
  • Author
    Kuo, Hong-Kwang Jeff; Gao, Yuqing
  • Author_Institution
    IBM Thomas J. Watson Res. Center, Yorktown Heights, NY, USA
  • Volume
    14
  • Issue
    3
  • fYear
    2006
  • fDate
    5/1/2006
  • Firstpage
    873
  • Lastpage
    881
  • Abstract
    Traditional statistical models for speech recognition have mostly been based on a Bayesian framework using generative models such as hidden Markov models (HMMs). This paper focuses on a new framework for speech recognition using maximum entropy direct modeling, where the probability of a state or word sequence given an observation sequence is computed directly from the model. In contrast to HMMs, features can be asynchronous and overlapping. This model therefore allows for the potential combination of many different types of features, which need not be statistically independent of each other. In this paper, a specific kind of direct model, the maximum entropy Markov model (MEMM), is studied. Even with conventional acoustic features, the approach already shows promising results for phone level decoding. The MEMM significantly outperforms traditional HMMs in word error rate when used as stand-alone acoustic models. Preliminary results combining the MEMM scores with HMM and language model scores show modest improvements over the best HMM speech recognizer.
  • Keywords
    hidden Markov models; maximum entropy methods; speech recognition; maximum entropy Markov model; maximum entropy direct models; stand-alone acoustic models; word error rate; Bayesian methods; Data mining; Decoding; Entropy; Error analysis; Natural languages; Probability; State-space methods; Direct modeling; maximum entropy acoustic modeling; nongenerative modeling
  • fLanguage
    English
  • Journal_Title
    Audio, Speech, and Language Processing, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1558-7916
  • Type
    jour
  • DOI
    10.1109/TSA.2005.858064
  • Filename
    1621200