• DocumentCode
    180429
  • Title

    Adaptive and discriminative modeling for improved mispronunciation detection

  • Author

    Franco, Horacio ; Ferrer, Luciana ; Bratt, Harry

  • Author_Institution
    Speech Technol. & Res. Lab., SRI Int., Menlo Park, CA, USA
  • fYear
    2014
  • fDate
    4-9 May 2014
  • Firstpage
    7709
  • Lastpage
    7713
  • Abstract
    In computer-aided language learning, automatic detection of specific phone mispronunciations by nonnative speakers can be used to give detailed feedback on individual pronunciation problems. In previous work, we found that explicitly modeling both the mispronunciations and the correct pronunciations of nonnative speakers yields significant improvements over standard approaches that compute posteriors with respect to native models. In this work, we extend that approach with model adaptation and discriminative modeling techniques inspired by methods that have proved effective in speaker identification. Two systems were developed: one based on Bayesian adaptation of Gaussian mixture models (GMMs) with likelihood-ratio-based detection, and another based on support vector machine (SVM) classification of supervectors derived from the adapted GMMs. Both systems, and their combination, were evaluated on a phonetically transcribed Spanish database of 130,000 phones uttered in continuous speech sentences by 206 nonnative speakers, and showed significant improvements over our previous best system.
  • Keywords
    Bayes methods; Gaussian processes; computer aided instruction; mixture models; speaker recognition; speech processing; support vector machines; Bayesian adaptation; GMM; Gaussian mixture models; Spanish database; adaptive modeling; automatic detection; computer-aided language learning; continuous speech sentences; discriminative modeling; improved mispronunciation detection; likelihood-ratio-based detection; model adaptation; nonnative speakers; speaker identification; specific phone mispronunciations; specific pronunciation problems; supervectors; support vector machines classification; Acoustics; Adaptation models; Computational modeling; Databases; Hidden Markov models; Speech; Mispronunciation detection
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Title
    Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on
  • Conference_Location
    Florence
  • Type

    conf

  • DOI
    10.1109/ICASSP.2014.6855100
  • Filename
    6855100
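
The first system named in the abstract (Bayesian adaptation of GMMs followed by likelihood-ratio detection) can be illustrated with a minimal sketch. The abstract gives no implementation details, so everything below is an assumption: it uses the relevance-MAP mean update that is common in speaker verification, one-dimensional features, and a hypothetical relevance factor of 16. The function names (`map_adapt_means`, `llr_score`) are illustrative and do not come from the paper.

```python
import math

# A GMM is a list of (weight, mean, variance) tuples over 1-D features.
# Real systems use multi-dimensional acoustic features; 1-D keeps the sketch short.

def log_gauss(x, mean, var):
    """Log density of a 1-D Gaussian."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)

def log_sum_exp(values):
    """Numerically stable log(sum(exp(v)))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))

def frame_log_likelihood(x, gmm):
    """Log likelihood of one frame under a GMM."""
    return log_sum_exp([math.log(w) + log_gauss(x, m, v) for w, m, v in gmm])

def map_adapt_means(gmm, frames, relevance=16.0):
    """Relevance-MAP adaptation of the means only (assumed variant).

    Weights and variances are copied from the prior model unchanged.
    """
    counts = [0.0] * len(gmm)
    sums = [0.0] * len(gmm)
    for x in frames:
        logs = [math.log(w) + log_gauss(x, m, v) for w, m, v in gmm]
        total = log_sum_exp(logs)
        for k, lp in enumerate(logs):
            gamma = math.exp(lp - total)  # responsibility of component k
            counts[k] += gamma
            sums[k] += gamma * x
    adapted = []
    for (w, m, v), n, s in zip(gmm, counts, sums):
        alpha = n / (n + relevance)          # data-dependent mixing weight
        data_mean = s / n if n > 0.0 else m  # fall back to the prior mean
        adapted.append((w, alpha * data_mean + (1.0 - alpha) * m, v))
    return adapted

def llr_score(frames, gmm_mispronounced, gmm_correct):
    """Average per-frame log-likelihood ratio; higher suggests mispronunciation."""
    diffs = [
        frame_log_likelihood(x, gmm_mispronounced) - frame_log_likelihood(x, gmm_correct)
        for x in frames
    ]
    return sum(diffs) / len(diffs)

# Toy usage: adapt one shared prior model to each class, then score a phone segment.
prior = [(0.5, -1.0, 1.0), (0.5, 1.0, 1.0)]
correct_model = map_adapt_means(prior, [1.2, 1.6, 1.4, 1.8, 1.5])
mispronounced_model = map_adapt_means(prior, [-1.2, -1.6, -1.4, -1.8, -1.5])
score = llr_score([-1.4, -1.7], mispronounced_model, correct_model)
```

In this sketch a phone segment would be flagged as mispronounced when `score` exceeds a threshold tuned on held-out data. For the second system in the abstract, the adapted means of each segment's GMM would be stacked into a supervector and passed to an SVM classifier; that step is not shown here.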