DocumentCode :
2971704
Title :
MLLR/MAP adaptation using pronunciation variation for non-native speech recognition
Author :
Oh, Yoo Rhee ; Kim, Hong Kook
Author_Institution :
Dept. of Inf. & Commun., Gwangju Inst. of Sci. & Technol. (GIST), Gwangju, South Korea
fYear :
2009
fDate :
Dec. 13 2009-Dec. 17 2009
Firstpage :
216
Lastpage :
221
Abstract :
In this paper, we propose an acoustic model adaptation method for non-native speech recognition that combines maximum likelihood linear regression (MLLR) and maximum a posteriori (MAP) adaptation with pronunciation variations. We first obtain pronunciation variations using an indirect data-driven approach. Next, we generate two sets of regression classes: overall regression classes, built for all pronunciations, and pronunciation variation regression classes, built for the pronunciation variations only. We then sequentially apply the two adaptations to non-native speech using the overall regression classes, while the acoustic models associated with the pronunciation variations are adapted using the pronunciation variation regression classes. Finally, both sets of adapted acoustic models are merged, so that the resulting acoustic models cover both the characteristics of non-native speakers and the pronunciation variations of non-native speech. Non-native automatic speech recognition (ASR) experiments on English continuous speech spoken by Korean speakers show that an ASR system employing the proposed adaptation method achieves a relative reduction in average word error rate of 9.43% compared to a traditional MLLR/MAP adaptation method.
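The adaptation flow described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it only shows the standard MLLR mean transform (mu' = A mu + b) and the standard MAP mean update, applied per regression class and then merged. The order (MLLR first, then MAP), the merge rule (models adapted with the pronunciation variation classes replace same-named overall models), and every name here (`mllr_mean`, `map_mean`, `adapt`, `merge`, `tau`) are assumptions made for illustration. The MLLR transforms are taken as given rather than estimated.

```python
from typing import Dict, List, Tuple

Vector = List[float]
Matrix = List[List[float]]


def mllr_mean(mean: Vector, A: Matrix, b: Vector) -> Vector:
    """Standard MLLR mean transform: mu' = A * mu + b."""
    return [sum(a * m for a, m in zip(row, mean)) + bi for row, bi in zip(A, b)]


def map_mean(prior: Vector, frames: List[Vector], gammas: List[float],
             tau: float) -> Vector:
    """Standard MAP mean update:
    mu = (tau * prior + sum_t gamma_t * x_t) / (tau + sum_t gamma_t).
    tau (> 0) is the prior weight; gammas are per-frame occupation counts."""
    occ = sum(gammas)
    dim = len(prior)
    weighted = [sum(g * x[d] for g, x in zip(gammas, frames)) for d in range(dim)]
    return [(tau * prior[d] + weighted[d]) / (tau + occ) for d in range(dim)]


def adapt(means: Dict[str, Vector],
          class_of: Dict[str, str],
          transforms: Dict[str, Tuple[Matrix, Vector]],
          data: Dict[str, Tuple[List[Vector], List[float]]],
          tau: float) -> Dict[str, Vector]:
    """Assumed order: MLLR with the model's regression class, then MAP."""
    adapted = {}
    for name, mean in means.items():
        A, b = transforms[class_of[name]]
        shifted = mllr_mean(mean, A, b)
        frames, gammas = data.get(name, ([], []))  # no data: keep MLLR result
        adapted[name] = map_mean(shifted, frames, gammas, tau)
    return adapted


def merge(overall: Dict[str, Vector],
          variation: Dict[str, Vector]) -> Dict[str, Vector]:
    """Assumed merge rule: variation-adapted models override overall ones."""
    merged = dict(overall)
    merged.update(variation)
    return merged
```

In this sketch, `adapt` would be called twice, once with the overall regression classes over all models and once with the pronunciation variation regression classes over the models tied to pronunciation variations, and the two results passed to `merge`.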
Keywords :
maximum likelihood estimation; regression analysis; speech recognition; MAP; MLLR; acoustic model adaptation method; maximum a posteriori algorithm; maximum likelihood linear regression; nonnative speech recognition; pronunciation variation; Acoustic testing; Adaptation model; Automatic speech recognition; Degradation; Error analysis; Loudspeakers; Maximum likelihood linear regression; Speech processing; Speech recognition; System testing;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Automatic Speech Recognition & Understanding, 2009. ASRU 2009. IEEE Workshop on
Conference_Location :
Merano
Print_ISBN :
978-1-4244-5478-5
Electronic_ISBN :
978-1-4244-5479-2
Type :
conf
DOI :
10.1109/ASRU.2009.5373299
Filename :
5373299