DocumentCode
302106
Title
Back-off method for n-gram smoothing based on binomial posteriori distribution
Author
Kawabata, Takeshi ; Tamoto, Masafumi
Author_Institution
NTT Basic Res. Labs., Atsugi, Japan
Volume
1
fYear
1996
fDate
7-10 May 1996
Firstpage
192
Abstract
The n-gram language model is powerful for treating natural spoken language; however, it requires a large spoken language corpus to estimate reliable model parameters. To estimate n-gram probabilities from sparse data, Katz's (1987) back-off smoothing method is promising. However, this approach is sometimes unstable because it uses singleton heuristics based on Turing's formula. This paper proposes a new back-off method based on the binomial posteriori distribution of n-gram probabilities, which achieves stable and more effective n-gram smoothing using a sophisticated calculation formula with no heuristics.
Keywords
binomial distribution; natural languages; parameter estimation; smoothing methods; speech recognition; binomial posteriori distribution; n-gram language model; n-gram probabilities; n-gram smoothing; natural spoken language; sparse data; Equations; Laboratories; Natural languages; Parameter estimation; Probability distribution; Smoothing methods; Statistical distributions; Statistics; Stochastic processes; Training data
fLanguage
English
Publisher
IEEE
Conference_Titel
Acoustics, Speech, and Signal Processing, 1996. ICASSP-96. Conference Proceedings., 1996 IEEE International Conference on
Conference_Location
Atlanta, GA
ISSN
1520-6149
Print_ISBN
0-7803-3192-3
Type
conf
DOI
10.1109/ICASSP.1996.540323
Filename
540323