  • DocumentCode
    774569
  • Title
    Capacity and complexity of HMM duration modeling techniques
  • Author
    Johnson, Michael T.
  • Author_Institution
    Electr. & Comput. Eng. Dept., Marquette Univ., Milwaukee, WI, USA
  • Volume
    12
  • Issue
    5
  • fYear
    2005
  • fDate
    5/1/2005
  • Firstpage
    407
  • Lastpage
    410
  • Abstract
    The ability of a standard hidden Markov model (HMM) or expanded state HMM (ESHMM) to accurately model duration distributions of phonemes is compared with specific duration-focused approaches such as semi-Markov models or variable transition probabilities. It is demonstrated that either a three-state ESHMM or a standard HMM with an increased number of states is capable of closely matching both Gamma distributions and duration distributions of phonemes from the TIMIT corpus, as measured by Bhattacharyya distance to the true distributions. Standard HMMs are easily implemented with off-the-shelf tools, whereas duration models require substantial algorithmic development and have higher computational costs when implemented, suggesting that a simple adjustment to HMM topologies is perhaps a more efficient solution to the problem of duration than more complex approaches.
  • Keywords
    gamma distribution; hidden Markov models; speech recognition; Gamma distributions; HMM modeling techniques; TIMIT corpus; phonemes duration distribution; standard hidden Markov model; Computational efficiency; Measurement standards; Network topology; Random processes; Silicon compounds; Standards development; Duration models
  • fLanguage
    English
  • Journal_Title
    Signal Processing Letters, IEEE
  • Publisher
    IEEE
  • ISSN
    1070-9908
  • Type
    jour
  • DOI
    10.1109/LSP.2005.845598
  • Filename
    1420352
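
The comparison described in the abstract, measuring how closely an HMM's implied duration distribution matches a Gamma distribution using Bhattacharyya distance, can be illustrated with a minimal sketch. This is not the paper's experiment. It assumes a plain left-to-right chain of states that all share one self-loop probability, so the duration is a sum of geometric stays (a negative binomial), a Gamma target with illustrative parameters (shape 5, scale 2, in frames), a truncation at 200 frames, and a simple grid search over the self-loop probability. The function names (`gamma_duration_pmf`, `chain_duration_pmf`, `best_fit`) are hypothetical, and the ESHMM topology is not modeled here.

```python
import math

D_MAX = 200  # durations are truncated to 1..D_MAX frames (illustrative choice)


def normalize(weights):
    """Rescale non-negative weights so they sum to one."""
    total = sum(weights)
    return [w / total for w in weights]


def gamma_duration_pmf(shape, scale, d_max=D_MAX):
    """Gamma density sampled at integer durations 1..d_max, renormalized."""
    log_norm = math.lgamma(shape) + shape * math.log(scale)
    dens = [math.exp((shape - 1) * math.log(d) - d / scale - log_norm)
            for d in range(1, d_max + 1)]
    return normalize(dens)


def chain_duration_pmf(n_states, self_loop, d_max=D_MAX):
    """Duration pmf of n_states left-to-right states sharing one self-loop
    probability: a sum of n geometric stays (negative binomial)."""
    exit_prob = 1.0 - self_loop
    pmf = []
    for d in range(1, d_max + 1):
        if d < n_states:
            pmf.append(0.0)  # every state is occupied for at least one frame
        else:
            pmf.append(math.comb(d - 1, n_states - 1)
                       * exit_prob ** n_states
                       * self_loop ** (d - n_states))
    return normalize(pmf)  # renormalize over the truncated support


def bhattacharyya_distance(p, q):
    """-ln of the Bhattacharyya coefficient between two discrete pmfs."""
    coeff = sum(math.sqrt(pi * qi) for pi, qi in zip(p, q))
    return -math.log(coeff)


def best_fit(n_states, target):
    """Grid-search the self-loop probability minimizing the distance.
    Returns (distance, self_loop)."""
    candidates = [i / 100 for i in range(1, 100)]
    return min(
        (bhattacharyya_distance(chain_duration_pmf(n_states, a), target), a)
        for a in candidates
    )


# Assumed target: Gamma with shape 5 and scale 2 (mean 10 frames).
target = gamma_duration_pmf(shape=5.0, scale=2.0)
for n in (1, 3):
    dist, a = best_fit(n, target)
    print(f"{n} state(s): best self-loop {a:.2f}, "
          f"Bhattacharyya distance {dist:.4f}")
```

Under these assumptions, a single state can only produce a monotonically decreasing (geometric) duration distribution, while a three-state chain produces a peaked one, so the three-state fit lands closer to the Gamma target. The chain cannot assign probability to durations shorter than its number of states, which is why the target here was chosen with little mass at one or two frames.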