• DocumentCode
    2885429
  • Title
    Improved backing-off for M-gram language modeling
  • Author
    Kneser, Reinhard ; Ney, Hermann
  • Author_Institution
    Philips GmbH Forschungslab., Aachen, Germany
  • Volume
    1
  • fYear
    1995
  • fDate
    9-12 May 1995
  • Firstpage
    181
  • Abstract
    In stochastic language modeling, backing-off is a widely used method to cope with the sparse data problem. In case of unseen events this method backs off to a less specific distribution. In this paper we propose to use distributions which are especially optimized for the task of backing-off. Two different theoretical derivations lead to distributions which are quite different from the probability distributions that are usually used for backing-off. Experiments show an improvement of about 10% in terms of perplexity and 5% in terms of word error rate.
  • Keywords
    grammars; natural languages; probability; speech processing; speech recognition; statistical analysis; stochastic processes; backing-off; distributions; experiments; perplexity; sparse data problem; stochastic language modeling; word error rate; Error analysis; History; Interpolation; Laboratories; Natural languages; Probability distribution; Smoothing methods; Stochastic processes; Training data
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech, and Signal Processing, 1995. ICASSP-95., 1995 International Conference on
  • Conference_Location
    Detroit, MI
  • ISSN
    1520-6149
  • Print_ISBN
    0-7803-2431-5
  • Type
    conf
  • DOI
    10.1109/ICASSP.1995.479394
  • Filename
    479394