• DocumentCode
    394345
  • Title

    Coarticulation modeling by embedding a target-directed hidden trajectory model into HMM - MAP decoding and evaluation

  • Author

    Seide, Frank ; Zhou, Jian-lai ; Deng, Li

  • Author_Institution
    5F Beijing Sigma Center, Microsoft Res. Asia, Beijing, China
  • Volume
    1
  • fYear
    2003
  • fDate
    6-10 April 2003
  • Abstract
    The hidden dynamic model (HDM) has been an attractive acoustic modeling approach because it provides a computational model for coarticulation and the dynamics of human speech. However, the lack of a direct decoding algorithm has been a barrier to research progress on HDM. We have developed a new HDM-based acoustic model, the hidden-trajectory HMM (HTHMM), which combines the state/mixture topology of a traditional monophone HMM with a target-directed hidden-trajectory model (a special form of HDM) for coarticulation modeling. Because the classical Viterbi algorithm is not admissible, we have developed a novel MAP decoding algorithm for HTHMM that correctly takes the hidden continuous trajectory into account. This paper introduces our new HTHMM decoder, which allows us, for the first time, to evaluate an HDM-type model by direct decoding instead of N-best rescoring. Using direct decoding, we demonstrate that the coarticulatory mechanism of our HTHMM matches traditional context-dependent modeling (enumeration of model parameters): the context-independent HTHMM has slightly better accuracy than a crossword-triphone HMM on the Aurora2 task. The decoder also enables us to incorporate state-boundary optimization into the HDM/HTHMM training procedure. This paper presents the detailed decoding algorithm and evaluation results; the HTHMM model itself and its parameter training are presented in Zhou et al. (2003).
  • Keywords
    hidden Markov models; maximum likelihood decoding; speech processing; speech recognition; Aurora2 task; HDM; HMM; HTHMM; MAP decoding; acoustic modeling; coarticulation modeling; direct decoding; embedding; hidden continuous trajectory; hidden dynamic model; hidden-trajectory HMM; monophone HMM; state-boundary optimization; state/mixture topology; target-directed hidden trajectory model; training procedure; Asia; Computational modeling; Context modeling; Decoding; Hidden Markov models; Humans; Speech analysis; Topology; Trajectory; Viterbi algorithm
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03). 2003 IEEE International Conference on
  • ISSN
    1520-6149
  • Print_ISBN
    0-7803-7663-3
  • Type

    conf

  • DOI
    10.1109/ICASSP.2003.1198889
  • Filename
    1198889