• DocumentCode
    2973747
  • Title
    Self-supervised discriminative training of statistical language models
  • Author
    Xu, Puyang; Karakos, Damianos; Khudanpur, Sanjeev
  • Author_Institution
    Dept. of Electr. & Comput. Eng., Johns Hopkins Univ., Baltimore, MD, USA
  • fYear
    2009
  • fDate
    Nov. 13 2009-Dec. 17 2009
  • Firstpage
    317
  • Lastpage
    322
  • Abstract
    A novel self-supervised discriminative training method for estimating language models for automatic speech recognition (ASR) is proposed. Unlike traditional discriminative training methods, which require transcribed speech, only untranscribed speech and a large text corpus are required. An exponential form is assumed for the language model, as in maximum entropy estimation, but the model is trained from the text using a discriminative criterion that targets word confusions actually witnessed in first-pass ASR output lattices. Specifically, model parameters are estimated to maximize the likelihood ratio between words w in the text corpus and w's cohorts in the test speech, i.e., other words that w competes with in the test lattices. Empirical results are presented to demonstrate statistically significant improvements over a 4-gram language model on a large-vocabulary ASR task.
  • Keywords
    computational linguistics; maximum entropy methods; maximum likelihood estimation; speech recognition; automatic speech recognition; maximum entropy estimation; maximum likelihood ratio; self-supervised discriminative training; statistical language model; Automatic speech recognition; Entropy; Humans; Lattices; Maximum likelihood estimation; Natural languages; Parameter estimation; Speech processing; Speech recognition; Testing
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Automatic Speech Recognition & Understanding, 2009. ASRU 2009. IEEE Workshop on
  • Conference_Location
    Merano
  • Print_ISBN
    978-1-4244-5478-5
  • Electronic_ISBN
    978-1-4244-5479-2
  • Type
    conf
  • DOI
    10.1109/ASRU.2009.5373401
  • Filename
    5373401