• DocumentCode
    353714
  • Title
    Selecting articles from the language model training corpus
  • Author
    Klakow, Dietrich
  • Author_Institution
    Philips GmbH Forschungslab., Aachen, Germany
  • Volume
    3
  • fYear
    2000
  • fDate
    2000
  • Firstpage
    1695
  • Abstract
    The paper suggests the use of a log-likelihood based criterion to select articles from a training corpus that are suitable to reduce perplexity on a specific task defined by a small target corpus. This method is efficient not only as an adaptation technique, reducing perplexity by 32% and the OOV rate from 4.2% to 2.7%, but also as a pruning technique, decreasing the language model size by a factor of 3 at the same time.
  • Keywords
    computational linguistics; linguistics; modelling; speech processing; OOV rate; adaptation technique; article selection; language model size; language model training corpus; log-likelihood based criterion; pruning technique; small target corpus; Adaptation model; Educational technology; Entropy; Interpolation; Optimization methods; Testing
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech, and Signal Processing, 2000. ICASSP '00. Proceedings. 2000 IEEE International Conference on
  • Conference_Location
    Istanbul
  • ISSN
    1520-6149
  • Print_ISBN
    0-7803-6293-4
  • Type
    conf
  • DOI
    10.1109/ICASSP.2000.862077
  • Filename
    862077