• DocumentCode
    3244374
  • Title
    Partial change accent models for accented Mandarin speech recognition
  • Author
    Yi, Liu; Fung, Pascale
  • Author_Institution
    Dept. of Electr. & Electron. Eng., Univ. of Sci. & Technol., Hong Kong, China
  • fYear
    2003
  • fDate
    30 Nov.-3 Dec. 2003
  • Firstpage
    111
  • Lastpage
    116
  • Abstract
    Regional accents in Mandarin speech result mostly from partial phone changes due to the interlanguage system of non-native speakers. We propose partial change accent models based on accent-specific units with acoustic model reconstruction for accented Mandarin speech recognition. We use phonological rules of dialectical pronunciations together with a likelihood ratio test to model actual accented variants rather than inherent phonetic confusions, recognizer errors, or other data-specific variations. To avoid model confusion and lexical confusion with the increased unit inventory, we improve model resolution by reconstructing the pre-trained acoustic model using the Gaussian mixtures from accent-specific unit models, where the accent-specific units are treated as hidden models. The effectiveness of this approach is evaluated on Cantonese-accented Mandarin speech. Our proposed method yields a significant 4.4% absolute word error rate (WER) reduction without sacrificing performance on the native speech recognition task. The reconstructed model can be applied in a single system to handle both accented and native speech.
  • Keywords
    Gaussian distribution; error statistics; hidden Markov models; maximum likelihood estimation; speech processing; speech recognition; Cantonese; Gaussian mixtures; WER reduction; accent-specific unit models; accented Mandarin speech recognition; acoustic model reconstruction; dialectical pronunciations; hidden models; interlanguage system; lexical confusion; likelihood ratio test; model confusion; model resolution; nonnative speakers; partial change accent models; partial phone changes; phonological rules; pre-trained acoustic model; regional accents; word error rate; Acoustic testing; Acoustical engineering; Automatic speech recognition; Dictionaries; Error analysis; Humans; Loudspeakers; Natural languages; Speech analysis; Speech recognition;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Automatic Speech Recognition and Understanding, 2003. ASRU '03. 2003 IEEE Workshop on
  • Print_ISBN
    0-7803-7980-2
  • Type
    conf
  • DOI
    10.1109/ASRU.2003.1318413
  • Filename
    1318413