  • DocumentCode
    417280
  • Title
    Segmental tonal modeling for phone set design in Mandarin LVCSR
  • Author
    Huang, Chao ; Shi, Yu ; Zhou, Jianlai ; Chu, Min ; Wang, Terry ; Chang, Eric
  • Author_Institution
    Microsoft Res. Asia, Beijing, China
  • Volume
    1
  • fYear
    2004
  • fDate
    17-21 May 2004
  • Abstract
    Modeling units play a very important role in state-of-the-art speech recognition systems. Their design and selection directly affect the performance of the final speech recognition engine. Because Mandarin is a tonal language, its modeling units must additionally account for tone. In this paper, after thoroughly investigating several dominant modeling strategies, we propose a new phone set design strategy for Mandarin, called segmental tonal modeling. Instead of modeling tone types directly, we realize them implicitly and jointly through two segments, both of which carry tonal information. Experiments based on both HTK and SAPI confirmed that this method is very efficient. In addition to improving accuracy by 9-23%, it greatly reduces decoding time, by 30-45%. At a similar decoding speed, the new phone set configuration reduces the error rate by 35% relative.
  • Keywords
    error statistics; speech processing; speech recognition; HTK; Mandarin LVCSR; SAPI; error rate reduction; performance; phone set design; segmental tonal modeling; state-of-art speech recognition systems; tonal language; Asia; Chaos; Context modeling; Decoding; Engines; Error analysis; Natural languages; Speech recognition; Tail
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech, and Signal Processing, 2004. Proceedings. (ICASSP '04). IEEE International Conference on
  • ISSN
    1520-6149
  • Print_ISBN
    0-7803-8484-9
  • Type
    conf
  • DOI
    10.1109/ICASSP.2004.1326132
  • Filename
    1326132