Adaptive and discriminative modeling for improved mispronunciation detection

Author

Franco, Horacio ; Ferrer, Luciana ; Bratt, Harry

Author_Institution

Speech Technol. & Res. Lab., SRI Int., Menlo Park, CA, USA

fYear

2014

fDate

4-9 May 2014

Firstpage

7709

Lastpage

7713

Abstract

In the context of computer-aided language learning, automatic detection of specific phone mispronunciations by nonnative speakers can be used to provide detailed feedback about specific pronunciation problems. In previous work, we found that explicitly modeling both mispronunciations and correct pronunciations by nonnative speakers could achieve significant improvements over standard approaches that compute posteriors with respect to native models. In this work, we extend our approach with model adaptation and discriminative modeling techniques, inspired by methods that have been effective in the area of speaker identification. Two systems were developed: one based on Bayesian adaptation of Gaussian Mixture Models (GMMs) and likelihood-ratio-based detection, and another based on Support Vector Machine (SVM) classification of supervectors derived from adapted GMMs. Both systems, and their combination, were evaluated on a phonetically transcribed Spanish database of 130,000 phones uttered in continuous speech sentences by 206 nonnative speakers, showing significant improvements over our previous best system.
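The two ingredients named in the abstract, Bayesian (MAP) adaptation of a GMM followed by likelihood-ratio scoring, and supervectors built from adapted means, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the toy 2-D features, the two-component background model, the relevance factor of 16, mean-only adaptation, and all function names are hypothetical.

```python
import numpy as np

def log_gauss_diag(X, means, variances):
    """Log density of each frame under each diagonal Gaussian: shape (N, K)."""
    diff = X[:, None, :] - means[None, :, :]
    mahal = np.sum(diff ** 2 / variances[None, :, :], axis=2)
    log_norm = np.sum(np.log(2.0 * np.pi * variances), axis=1)
    return -0.5 * (mahal + log_norm[None, :])

def log_sum_exp(a):
    """Numerically stable log(sum(exp(a))) over the last axis."""
    m = np.max(a, axis=-1, keepdims=True)
    return m[..., 0] + np.log(np.sum(np.exp(a - m), axis=-1))

def avg_log_likelihood(X, weights, means, variances):
    """Average per-frame log-likelihood of X under a diagonal GMM."""
    log_joint = np.log(weights)[None, :] + log_gauss_diag(X, means, variances)
    return float(np.mean(log_sum_exp(log_joint)))

def map_adapt_means(X, weights, means, variances, relevance=16.0):
    """Mean-only MAP adaptation with a relevance factor (assumed value)."""
    log_joint = np.log(weights)[None, :] + log_gauss_diag(X, means, variances)
    post = np.exp(log_joint - log_sum_exp(log_joint)[:, None])  # (N, K)
    n_k = post.sum(axis=0)                                      # soft counts
    data_mean = (post.T @ X) / np.maximum(n_k, 1e-10)[:, None]
    alpha = (n_k / (n_k + relevance))[:, None]
    # Interpolate between the data mean and the prior (background) mean.
    return alpha * data_mean + (1.0 - alpha) * means

# Hypothetical background GMM for one phone (not estimated from real data).
weights = np.array([0.5, 0.5])
means = np.array([[-2.0, 0.0], [2.0, 0.0]])
variances = np.ones((2, 2))

# Toy "correct" and "mispronounced" training frames for that phone.
rng = np.random.default_rng(0)
train_correct = rng.normal([2.0, 1.5], 1.0, size=(200, 2))
train_mispron = rng.normal([-2.0, -1.5], 1.0, size=(200, 2))

means_correct = map_adapt_means(train_correct, weights, means, variances)
means_mispron = map_adapt_means(train_mispron, weights, means, variances)

def llr(X):
    """Log-likelihood ratio: correct-class model vs. mispronounced-class model."""
    return (avg_log_likelihood(X, weights, means_correct, variances)
            - avg_log_likelihood(X, weights, means_mispron, variances))

llr_correct = llr(rng.normal([2.0, 1.5], 1.0, size=(200, 2)))
llr_mispron = llr(rng.normal([-2.0, -1.5], 1.0, size=(200, 2)))

# A supervector here is simply the stacked adapted means of one model.
supervector = means_correct.reshape(-1)
```

In this sketch a positive ratio favors the correct-pronunciation model and a negative one favors the mispronunciation model; a detection threshold would be tuned on held-out data. For the second system described above, supervectors like the one built here would be passed to an SVM classifier, which is omitted to keep the example dependency-free.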

Keywords

Bayes methods; Gaussian processes; computer aided instruction; mixture models; speaker recognition; speech processing; support vector machines; Bayesian adaptation; GMM; Gaussian mixture models; Spanish database; adaptive modeling; automatic detection; computer-aided language learning; continuous speech sentences; discriminative modeling; improved mispronunciation detection; likelihood-ratio-based detection; model adaptation; nonnative speakers; speaker identification; specific phone mispronunciations; specific pronunciation problems; supervectors; support vector machines classification; Acoustics; Adaptation models; Computational modeling; Databases; Hidden Markov models; Speech; Mispronunciation detection

fLanguage

English

Publisher

IEEE

Conference_Title

Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on

Conference_Location

Florence

Type

conf

DOI

10.1109/ICASSP.2014.6855100

Filename

6855100