Title :
Adaptive and discriminative modeling for improved mispronunciation detection
Author :
Franco, Horacio ; Ferrer, Luciana ; Bratt, Harry
Author_Institution :
Speech Technol. & Res. Lab., SRI Int., Menlo Park, CA, USA
Abstract :
In the context of computer-aided language learning, automatic detection of specific phone mispronunciations by nonnative speakers can be used to provide detailed feedback about specific pronunciation problems. In previous work, we found that explicitly modeling both mispronunciations and correct pronunciations by nonnative speakers yields significant improvements over standard approaches that compute posteriors with respect to native models. In this work, we extend our approach with model adaptation and discriminative modeling techniques, inspired by methods that have been effective in speaker identification. Two systems were developed: one based on Bayesian adaptation of Gaussian mixture models (GMMs) with likelihood-ratio-based detection, and another based on support vector machine (SVM) classification of supervectors derived from adapted GMMs. Both systems, and their combination, were evaluated on a phonetically transcribed Spanish database of 130,000 phones uttered in continuous speech sentences by 206 nonnative speakers, and showed significant improvements over our previous best system.
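The two systems named in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the abstract gives no configuration details, so mean-only relevance-MAP adaptation, diagonal covariances, the `relevance` value, and all function names below are assumptions chosen for illustration. The sketch adapts a shared background GMM toward "correct" and "mispronounced" training frames for one phone, scores a test segment by average log-likelihood ratio, and stacks adapted means into a supervector of the kind that could be passed to an SVM.

```python
import math

def log_gauss(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(-0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))

def component_logs(x, weights, means, variances):
    """Log of (mixture weight * component density) for each GMM component."""
    return [math.log(w) + log_gauss(x, m, v)
            for w, m, v in zip(weights, means, variances)]

def logsumexp(vals):
    """Numerically stable log(sum(exp(vals)))."""
    top = max(vals)
    return top + math.log(sum(math.exp(v - top) for v in vals))

def map_adapt_means(frames, weights, means, variances, relevance=16.0):
    """Mean-only relevance-MAP adaptation (assumed variant).

    Weights and variances are kept from the background model; each mean
    moves toward the data in proportion to its soft frame count.
    """
    k, d = len(means), len(means[0])
    counts = [0.0] * k
    sums = [[0.0] * d for _ in range(k)]
    for x in frames:
        logs = component_logs(x, weights, means, variances)
        total = logsumexp(logs)
        for c in range(k):
            post = math.exp(logs[c] - total)  # component posterior for x
            counts[c] += post
            for j in range(d):
                sums[c][j] += post * x[j]
    # Equivalent to alpha * data_mean + (1 - alpha) * prior_mean,
    # with alpha = count / (count + relevance).
    return [[(sums[c][j] + relevance * means[c][j]) / (counts[c] + relevance)
             for j in range(d)] for c in range(k)]

def llr_score(frames, weights, variances, means_mis, means_cor):
    """Average per-frame log-likelihood ratio, mispronounced vs. correct.

    Higher values favor the mispronunciation model; the decision
    threshold would be tuned on held-out data.
    """
    total = 0.0
    for x in frames:
        total += logsumexp(component_logs(x, weights, means_mis, variances))
        total -= logsumexp(component_logs(x, weights, means_cor, variances))
    return total / len(frames)

def supervector(adapted_means):
    """Concatenate adapted component means into one fixed-length vector."""
    return [v for mean in adapted_means for v in mean]
```

In a supervector system of this kind, a GMM would typically be adapted to each phone instance and the resulting vector classified by an SVM trained on labeled instances; that classifier is omitted here to keep the sketch free of external dependencies.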
Keywords :
Bayes methods; Gaussian processes; computer aided instruction; mixture models; speaker recognition; speech processing; support vector machines; Bayesian adaptation; GMM; Gaussian mixture models; Spanish database; adaptive modeling; automatic detection; computer-aided language learning; continuous speech sentences; discriminative modeling; improved mispronunciation detection; likelihood-ratio-based detection; model adaptation; nonnative speakers; speaker identification; specific phone mispronunciations; specific pronunciation problems; supervectors; support vector machines classification; Acoustics; Adaptation models; Computational modeling; Databases; Hidden Markov models; Speech; Support vector machines; Mispronunciation detection; computer-aided language learning;
Conference_Title :
Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on
Conference_Location :
Florence
DOI :
10.1109/ICASSP.2014.6855100