• DocumentCode
    179579
  • Title
    Highly accurate phonetic segmentation using boundary correction models and system fusion
  • Author
    Stolcke, Andreas ; Ryant, Neville ; Mitra, Ved ; Yuan, Jiahong ; Wang, Wen ; Liberman, Mark
  • Author_Institution
    Microsoft Res., Mountain View, CA, USA
  • fYear
    2014
  • fDate
    4-9 May 2014
  • Firstpage
    5552
  • Lastpage
    5556
  • Abstract
    Accurate phone-level segmentation of speech remains an important task for many subfields of speech research. We investigate techniques for boosting the accuracy of automatic phonetic segmentation based on HMM acoustic-phonetic models. In prior work [25] we were able to improve on state-of-the-art alignment accuracy by employing special phone boundary HMM models, trained on phonetically segmented training data, in conjunction with a simple boundary-time correction model. Here we present further improved results by using more powerful statistical models for boundary correction that are conditioned on phonetic context and duration features. Furthermore, we find that combining multiple acoustic front-ends gives additional gains in accuracy, and that conditioning the combiner on phonetic context and side information helps. Overall, we reduce segmentation errors on the TIMIT corpus by almost one half, raising boundary accuracy from 93.9% to 96.8% (that is, cutting the boundary error rate from 6.1% to 3.2%) at a 20-ms tolerance.
  • Keywords
    hidden Markov models; speech processing; statistical analysis; HMM acoustic-phonetic models; TIMIT; acoustic front-ends; automatic phonetic segmentation; boundary correction models; boundary-time correction model; phone boundary HMM models; phone-level segmentation; phonetic context; phonetic segmentation; phonetically segmented training data; segmentation errors; speech research; statistical models; system fusion; Acoustics; Conferences; Decision support systems; Speech; HMM; forced alignment; phone boundary model; regression
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on
  • Conference_Location
    Florence
  • Type
    conf
  • DOI
    10.1109/ICASSP.2014.6854665
  • Filename
    6854665