• DocumentCode
    323761
  • Title
    Topic adaptation for language modeling using unnormalized exponential models
  • Author
    Chen, Stanley F. ; Seymore, Kristie ; Rosenfeld, Ronald
  • Author_Institution
    Sch. of Comput. Sci., Carnegie Mellon Univ., Pittsburgh, PA, USA
  • Volume
    2
  • fYear
    1998
  • fDate
    12-15 May 1998
  • Firstpage
    681
  • Abstract
    We present novel techniques for performing topic adaptation on an n-gram language model. Given training text labeled with topic information, we automatically identify the most relevant topics for new text. We adapt our language model toward these topics using an exponential model, by adjusting the probabilities in our model to agree with those found in the topical subset of the training data. For efficiency, we do not normalize the model; that is, we do not require that the “probabilities” in the language model sum to 1. With these techniques, we were able to achieve a modest reduction in speech recognition word-error rate in the broadcast news domain.
  • Keywords
    broadcasting; grammars; maximum entropy methods; natural languages; probability; speech processing; speech recognition; broadcast news; first-pass transcription likelihood; language modeling; maximum entropy training; n-gram language model; probabilities; robust caching; topic adaptation; topic information; training data; training text; unnormalized exponential models; word-error rate reduction; Adaptation model; Boosting; Broadcasting; DNA; Equations; Frequency; Lattices; Probability; Testing; Training data
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech and Signal Processing, 1998. Proceedings of the 1998 IEEE International Conference on
  • Conference_Location
    Seattle, WA
  • ISSN
    1520-6149
  • Print_ISBN
    0-7803-4428-6
  • Type
    conf
  • DOI
    10.1109/ICASSP.1998.675356
  • Filename
    675356