• DocumentCode
    1059802
  • Title
    Discriminatively Trained GMMs for Language Classification Using Boosting Methods
  • Author
    Siu, Man-Hung ; Yang, Xi ; Gish, Herbert
  • Author_Institution
    Speech & Language Process. Dept., BBN Technol., Cambridge, MA
  • Volume
    17
  • Issue
    1
  • fYear
    2009
  • Firstpage
    187
  • Lastpage
    197
  • Abstract
    In language identification and other speech applications, discriminatively trained models often outperform nondiscriminative models trained with the maximum-likelihood criterion. For instance, discriminative Gaussian mixture models (GMMs) are typically trained by optimizing some discriminative criteria that can be computationally expensive and complex to implement. In this paper, we explore a novel approach to discriminative GMM training by using a variant of the boosting framework (R. Schapire, “The boosting approach to machine learning, an overview,” Proc. MSRI Workshop on Nonlinear Estimation and Classification, 2002) from machine learning, in which an ensemble of GMMs is trained sequentially. We have extended the purview of boosting to class conditional models (as opposed to discriminative models such as classification trees). The effectiveness of our boosting variation comes from the emphasis on working with the misclassified data to achieve discriminatively trained models. Our variant of boosting also uses low-confidence data classifications, as well as misclassified examples, in classifier generation. We further apply our boosting approach to anti-models to achieve additional performance gains. We have applied our discriminative training approach to a variety of language identification experiments using the 12-language NIST 2003 language identification task. We show the significant performance improvements that can be obtained. The experiments include both acoustic and token-based speech models. Our best performing boosted GMM-based system on the 12-language verification task has a 2.3% EER.
  • Keywords
    Gaussian processes; maximum likelihood estimation; natural language processing; pattern classification; 12-language NIST 2003 language identification task; boosting methods; discriminative Gaussian mixture models; discriminatively trained GMM; language classification; language identification; low confidence data classifications; maximum-likelihood criterion; Boosting; Classification tree analysis; Logistics; Machine learning; Maximum likelihood estimation; NIST; Natural languages; Parameter estimation; Performance gain; Speech processing; Boosting; discriminative training; language identification;
  • fLanguage
    English
  • Journal_Title
    Audio, Speech, and Language Processing, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1558-7916
  • Type
    jour
  • DOI
    10.1109/TASL.2008.2006653
  • Filename
    4740154