• DocumentCode
    3485389
  • Title
    Subword-based multi-span pronunciation adaptation for recognizing accented speech
  • Author
    Mertens, Timo ; Thambiratnam, Kit ; Seide, Frank
  • Author_Institution
    Dept. of Electron. & Telecommun., Norwegian Univ. of Sci. & Technol., Trondheim, Norway
  • fYear
    2011
  • fDate
    11-15 Dec. 2011
  • Firstpage
    254
  • Lastpage
    259
  • Abstract
    We investigate automatic pronunciation adaptation for non-native accented speech by using statistical models trained on multi-span linguistic parse tables to generate candidate mispronunciations for a target language. Compared to traditional phone re-writing rules, parse table modeling captures more context in the form of phone-clusters or syllables, and encodes abstract features such as word-internal position or syllable structure. The proposed approach is attractive because it gives a unified method for combining multiple levels of linguistic information. The reported experiments demonstrate word error rate reductions of up to 7.9% and 3.3% absolute on Italian- and German-accented English, respectively, using lexicon adaptation alone, and 12.4% and 11.3% absolute when combined with acoustic adaptation.
  • Keywords
    speech recognition; English; German; Italian; acoustic adaptation; automatic pronunciation adaptation; lexicon adaptation; multi-span linguistic parse tables; non-native accented speech; parse table modeling; phone rewriting rules; phone-clusters; statistical models trained; subword-based multi-span pronunciation adaptation; word-internal position; Adaptation models; Context; Context modeling; Pragmatics; Speech; Training; Training data
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Automatic Speech Recognition and Understanding (ASRU), 2011 IEEE Workshop on
  • Conference_Location
    Waikoloa, HI
  • Print_ISBN
    978-1-4673-0365-1
  • Electronic_ISBN
    978-1-4673-0366-8
  • Type
    conf
  • DOI
    10.1109/ASRU.2011.6163940
  • Filename
    6163940