DocumentCode :
672356
Title :
Automatic pronunciation clustering using a World English archive and pronunciation structure analysis
Author :
Shen, H.-P. ; Minematsu, Nobuaki ; Makino, Tatsuya ; Weinberger, S.H. ; Pongkittiphan, T. ; Wu, Chao-Hsin
Author_Institution :
Nat. Cheng Kung Univ., Tainan, Taiwan
fYear :
2013
fDate :
8-12 Dec. 2013
Firstpage :
222
Lastpage :
227
Abstract :
English is the only language available for global communication. Due to the influence of speakers' mother tongues, however, speakers from different regions inevitably pronounce English with different accents. The ultimate goal of our project is to create a global pronunciation map of World Englishes on an individual basis, which speakers can use to locate English pronunciations similar to their own. A speaker who is a learner can also see how his or her pronunciation compares to other varieties. Mathematically, creating the map requires a matrix of pronunciation distances among all the speakers considered. This paper investigates invariant pronunciation structure analysis and Support Vector Regression (SVR) to predict the inter-speaker pronunciation distances. In the experiments, the Speech Accent Archive (SAA), which contains speech data of accented English from around the world, provides the training and testing samples. IPA narrow transcriptions in the archive are used to prepare reference pronunciation distances, which are then predicted from structural analysis and SVR, without using the IPA transcriptions. The correlation between the reference distances and the predicted distances is calculated. The experimental results are very promising: the proposed method far outperforms a baseline system developed using an HMM-based phoneme recognizer.
Keywords :
hidden Markov models; information retrieval systems; natural language processing; records management; regression analysis; support vector machines; HMM based phoneme recognizer; IPA narrow transcriptions; SAA; SVR; automatic pronunciation clustering; English pronunciation; global communication; interspeaker pronunciation; pronunciation structure analysis; speakers mother tongue; speech accent archive; speech data; support vector regression; world English archive; Correlation; Educational institutions; Grammar; Hidden Markov models; Speech; Support vector machines; Training; World Englishes; f-divergence; pronunciation structure analysis; speaker-based pronunciation clustering; support vector regression
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Automatic Speech Recognition and Understanding (ASRU), 2013 IEEE Workshop on
Conference_Location :
Olomouc
Type :
conf
DOI :
10.1109/ASRU.2013.6707733
Filename :
6707733