• DocumentCode
    302106
  • Title

    Back-off method for n-gram smoothing based on binomial posteriori distribution

  • Author

    Kawabata, Takeshi ; Tamoto, Masafumi

  • Author_Institution
    NTT Basic Res. Labs., Atsugi, Japan
  • Volume
    1
  • fYear
    1996
  • fDate
    7-10 May 1996
  • Firstpage
    192
  • Abstract
    The n-gram language model is powerful for treating natural spoken language; however, it requires a large spoken-language corpus to estimate reliable model parameters. To estimate n-gram probabilities from sparse data, Katz's (1987) back-off smoothing method is promising. However, this approach is sometimes unstable because it uses singleton heuristics based on Turing's formula. This paper proposes a new back-off method based on the binomial posteriori distribution of n-gram probabilities, which achieves stable and more effective n-gram smoothing using a sophisticated calculation formula with no heuristics.
  • Keywords
    binomial distribution; natural languages; parameter estimation; smoothing methods; speech recognition; binomial posteriori distribution; n-gram language model; n-gram probabilities; n-gram smoothing; natural spoken language; sparse data; Equations; Laboratories; Natural languages; Parameter estimation; Probability distribution; Smoothing methods; Statistical distributions; Statistics; Stochastic processes; Training data
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech, and Signal Processing, 1996. ICASSP-96. Conference Proceedings., 1996 IEEE International Conference on
  • Conference_Location
    Atlanta, GA
  • ISSN
    1520-6149
  • Print_ISBN
    0-7803-3192-3
  • Type

    conf

  • DOI
    10.1109/ICASSP.1996.540323
  • Filename
    540323