DocumentCode
178692
Title
Improved and robust prediction of pronunciation distance for individual-basis clustering of World Englishes pronunciation
Author
Kasahara, Shun ; Kitahara, Saori ; Minematsu, Nobuaki ; Shen, H.-P. ; Makino, Tatsuya ; Saito, Daisuke ; Hirose, K.
Author_Institution
Univ. of Tokyo, Tokyo, Japan
fYear
2014
fDate
4-9 May 2014
Firstpage
3216
Lastpage
3220
Abstract
English is the only language available for global communication and is used by approximately 1.5 billion speakers. It is also known to have a large diversity of pronunciations, called accents, due to the influence of speakers' mother tongues. Our project aims at creating a global, individual-basis map of English pronunciations to be used in teaching and learning World Englishes (WE) as well as in research studies of WE [1, 2]. Mathematically, creating the map requires a distance matrix of pronunciation differences among all the speakers considered; technically, it requires a method of predicting the pronunciation distance between any pair of speakers using only their speech samples. In our previous study [3], we combined invariant pronunciation structure analysis [4, 5, 6, 7] and Support Vector Regression (SVR) to predict the inter-speaker pronunciation distances. In this paper, several techniques are introduced and examined to determine whether they can increase the accuracy and robustness of prediction. Experiments show that the correlation between IPA-based reference distances and the predicted distances increases from 0.805 to 0.903, which exceeds the correlation of 0.829 obtained by using the phoneme-based ground truth distances.
Keywords
natural language processing; pattern clustering; speech processing; English pronunciations; IPA-based reference distances; World Englishes; distance matrix; individual-basis clustering; phoneme-based ground truth distances; pronunciation distance prediction; speech samples; Acoustics; Correlation; Educational institutions; Hidden Markov models; Robustness; Speech; Training; IPA transcription; SAA; World Englishes; f-divergence; phoneme recognition; pronunciation clustering; pronunciation structure analysis; support vector regression
fLanguage
English
Publisher
IEEE
Conference_Titel
Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on
Conference_Location
Florence
Type
conf
DOI
10.1109/ICASSP.2014.6854194
Filename
6854194