Title :
Subword-based multi-span pronunciation adaptation for recognizing accented speech
Author :
Mertens, Timo ; Thambiratnam, Kit ; Seide, Frank
Author_Institution :
Dept. of Electron. & Telecommun., Norwegian Univ. of Sci. & Technol., Trondheim, Norway
Abstract :
We investigate automatic pronunciation adaptation for non-native accented speech by using statistical models trained on multi-span linguistic parse tables to generate candidate mispronunciations for a target language. Compared to traditional phone rewriting rules, parse table modeling captures more context in the form of phone clusters or syllables, and encodes abstract features such as word-internal position or syllable structure. The proposed approach is attractive because it gives a unified method for combining multiple levels of linguistic information. The reported experiments demonstrate word error rate reductions of up to 7.9% and 3.3% absolute on Italian- and German-accented English, respectively, using lexicon adaptation alone, and 12.4% and 11.3% absolute when combined with acoustic adaptation.
Keywords :
speech recognition; English; German; Italian; acoustic adaptation; automatic pronunciation adaptation; lexicon adaptation; multi-span linguistic parse tables; non-native accented speech; parse table modeling; phone rewriting rules; phone-clusters; statistical models trained; subword-based multi-span pronunciation adaptation; word-internal position; Adaptation models; Context; Context modeling; Pragmatics; Speech; Training; Training data
Conference_Title :
Automatic Speech Recognition and Understanding (ASRU), 2011 IEEE Workshop on
Conference_Location :
Waikoloa, HI
Print_ISBN :
978-1-4673-0365-1
Electronic_ISBN :
978-1-4673-0366-8
DOI :
10.1109/ASRU.2011.6163940