DocumentCode
180429
Title
Adaptive and discriminative modeling for improved mispronunciation detection
Author
Franco, Horacio ; Ferrer, Luciana ; Bratt, Harry
Author_Institution
Speech Technol. & Res. Lab., SRI Int., Menlo Park, CA, USA
fYear
2014
fDate
4-9 May 2014
Firstpage
7709
Lastpage
7713
Abstract
In the context of computer-aided language learning, automatic detection of specific phone mispronunciations by nonnative speakers can be used to provide detailed feedback about specific pronunciation problems. In previous work we found that significant improvements could be achieved, compared to standard approaches that compute posteriors with respect to native models, by explicitly modeling both mispronunciations and correct pronunciations by nonnative speakers. In this work, we extend our approach with model adaptation and discriminative modeling techniques inspired by methods that have been effective in speaker identification. Two systems were developed: one based on Bayesian adaptation of Gaussian mixture models (GMMs) and likelihood-ratio-based detection, and another based on support vector machine classification of supervectors derived from adapted GMMs. Both systems, and their combination, were evaluated on a phonetically transcribed Spanish database of 130,000 phones uttered in continuous speech sentences by 206 nonnative speakers, showing significant improvements over our previous best system.
Keywords
Bayes methods; Gaussian processes; computer aided instruction; mixture models; speaker recognition; speech processing; support vector machines; Bayesian adaptation; GMM; Gaussian mixture models; Spanish database; adaptive modeling; automatic detection; computer-aided language learning; continuous speech sentences; discriminative modeling; improved mispronunciation detection; likelihood-ratio-based detection; model adaptation; nonnative speakers; speaker identification; specific phone mispronunciations; specific pronunciation problems; supervectors; support vector machines classification; Acoustics; Adaptation models; Computational modeling; Databases; Hidden Markov models; Speech; Support vector machines; Mispronunciation detection; computer-aided language learning
fLanguage
English
Publisher
IEEE
Conference_Titel
Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on
Conference_Location
Florence
Type
conf
DOI
10.1109/ICASSP.2014.6855100
Filename
6855100