• DocumentCode
    779955
  • Title

    Trajectory Clustering for Solving the Trajectory Folding Problem in Automatic Speech Recognition

  • Author

    Han, Yan ; De Veth, Johan ; Boves, Lou

  • Author_Institution
    Center for Language and Speech Technology, Radboud University Nijmegen
  • Volume
    15
  • Issue
    4
  • fYear
    2007
  • fDate
    5/1/2007
  • Firstpage
    1425
  • Lastpage
    1434
  • Abstract
    In this paper, we introduce a novel method for clustering speech gestures, represented as continuous trajectories in acoustic parameter space. Trajectory clustering allows us to avoid the conditional independence assumption that makes it difficult to account for the fact that successive measurements of an articulatory gesture are correlated. We apply the trajectory clustering method for developing multiple parallel hidden Markov models (HMMs) for a continuous digits recognition task. We compare the performance obtained with data-driven clustering to the recognition performance obtained with conventional head-body-tail models, which use knowledge-based criteria for building multiple HMMs in order to obviate the trajectory folding problem. The results show that trajectory clustering is able to discover structure in the training database that is different from the structure assumed by the knowledge-based approach. In addition, the data-derived structure gives rise to significantly better recognition performance, and results in a 10% word error rate reduction.
  • Keywords
    hidden Markov models; speech recognition; acoustic parameter space; articulatory gesture; automatic speech recognition; continuous digits recognition task; data-derived structure; multiple parallel hidden Markov models; trajectory clustering; trajectory folding problem; Acoustic measurements; Buildings; Clustering methods; Databases; Error analysis; High definition video; Natural languages; Speech processing; folding; mixture of regressions; multiple-path hidden Markov model; speech trajectory clustering; trajectory folding
  • fLanguage
    English
  • Journal_Title
    Audio, Speech, and Language Processing, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1558-7916
  • Type

    jour

  • DOI
    10.1109/TASL.2007.894529
  • Filename
    4156197