Regional Information Center for Science and Technology - MLLR/MAP adaptation using pronunciation variation for non-native speech recognition

DocumentCode :

2971704

Title :

MLLR/MAP adaptation using pronunciation variation for non-native speech recognition

Author :

Oh, Yoo Rhee ; Kim, Hong Kook

Author_Institution :

Dept. of Inf. & Commun., Gwangju Inst. of Sci. & Technol. (GIST), Gwangju, South Korea

fYear :

2009

fDate :

Nov. 13 2009-Dec. 17 2009

Firstpage :

216

Lastpage :

221

Abstract :

In this paper, we propose an acoustic model adaptation method based on maximum likelihood linear regression (MLLR) and maximum a posteriori (MAP) adaptation that uses pronunciation variations for non-native speech recognition. To this end, we first obtain pronunciation variations using an indirect data-driven approach. Next, we generate two sets of regression classes: one composed of regression classes for all pronunciations and the other of classes for pronunciation variations. The former are referred to as overall regression classes and the latter as pronunciation variation regression classes. We then sequentially apply the two adaptations to non-native speech using the overall regression classes, while the acoustic models associated with the pronunciation variations are adapted using the pronunciation variation regression classes. In the final step, both sets of adapted acoustic models are merged. Thus, the resultant acoustic models cover the characteristics of non-native speakers as well as the pronunciation variations of non-native speech. Non-native automatic speech recognition (ASR) experiments on continuous English speech spoken by Koreans show that an ASR system employing the proposed adaptation method achieves a relative reduction of 9.43% in average word error rate compared to a traditional MLLR/MAP adaptation method.
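
The sequential MLLR-then-MAP step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the textbook forms of the two updates (an affine MLLR transform of each Gaussian mean per regression class, followed by a MAP mean update that uses the MLLR-adapted mean as its prior), adapts means only, and treats the final merge as a plain union of the two adapted model sets. All names (`mllr_mean`, `map_mean`, `adapt_means`, `merge_models`), the prior weight `tau`, and the data layout are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

Vector = List[float]
Matrix = List[List[float]]


def mllr_mean(mean: Vector, A: Matrix, b: Vector) -> Vector:
    """Apply an MLLR affine transform to a Gaussian mean: A @ mean + b."""
    return [sum(a * m for a, m in zip(row, mean)) + b_i
            for row, b_i in zip(A, b)]


def map_mean(prior_mean: Vector, frames: Sequence[Vector],
             gammas: Sequence[float], tau: float) -> Vector:
    """MAP mean update: (tau * prior + sum_t gamma_t * o_t) / (tau + sum_t gamma_t)."""
    occupancy = sum(gammas)
    return [
        (tau * prior_mean[d] + sum(g * o[d] for g, o in zip(gammas, frames)))
        / (tau + occupancy)
        for d in range(len(prior_mean))
    ]


def adapt_means(
    means: Dict[str, Vector],
    class_of: Dict[str, str],
    transforms: Dict[str, Tuple[Matrix, Vector]],
    stats: Dict[str, Tuple[Sequence[Vector], Sequence[float]]],
    tau: float = 10.0,  # hypothetical prior weight, not taken from the paper
) -> Dict[str, Vector]:
    """Sequentially apply MLLR, then MAP, to every Gaussian mean."""
    adapted = {}
    for name, mean in means.items():
        A, b = transforms[class_of[name]]        # transform of this model's regression class
        prior = mllr_mean(mean, A, b)            # step 1: MLLR
        frames, gammas = stats.get(name, ([], []))
        adapted[name] = map_mean(prior, frames, gammas, tau)  # step 2: MAP
    return adapted


def merge_models(overall: Dict[str, Vector],
                 variants: Dict[str, Vector]) -> Dict[str, Vector]:
    """Assumed merge rule: union of both sets; variant entries win on a name clash."""
    merged = dict(overall)
    merged.update(variants)
    return merged
```

In this sketch, `adapt_means` would be called once with the overall regression classes and once, for the models tied to pronunciation variations, with the pronunciation variation regression classes; `merge_models` then combines the two results. A model with no adaptation frames keeps its MLLR-adapted mean, since the MAP update falls back to the prior.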

Keywords :

maximum likelihood estimation; regression analysis; speech recognition; MAP; MLLR; acoustic model adaptation method; maximum a posteriori algorithm; maximum likelihood linear regression; nonnative speech recognition; pronunciation variation; Acoustic testing; Adaptation model; Automatic speech recognition; Degradation; Error analysis; Loudspeakers; Maximum likelihood linear regression; Speech processing; Speech recognition; System testing;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Automatic Speech Recognition & Understanding, 2009. ASRU 2009. IEEE Workshop on

Conference_Location :

Merano

Print_ISBN :

978-1-4244-5478-5

Electronic_ISBN :

978-1-4244-5479-2

Type :

conf

DOI :

10.1109/ASRU.2009.5373299

Filename :

5373299

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=2971704