DocumentCode :
302106
Title :
Back-off method for n-gram smoothing based on binomial posteriori distribution
Author :
Kawabata, Takeshi ; Tamoto, Masafumi
Author_Institution :
NTT Basic Res. Labs., Atsugi, Japan
Volume :
1
fYear :
1996
fDate :
7-10 May 1996
Firstpage :
192
Abstract :
The n-gram language model is powerful for treating natural spoken language; however, it requires a large spoken language corpus to estimate reliable model parameters. Katz's (1987) back-off smoothing method is promising for estimating n-gram probabilities from sparse data. However, this approach is sometimes unstable because it uses singleton heuristics based on Turing's formula. This paper proposes a new back-off method based on the binomial posteriori distribution of n-gram probabilities, which achieves stable and more effective n-gram smoothing using a sophisticated calculation formula with no heuristics.
Keywords :
binomial distribution; natural languages; parameter estimation; smoothing methods; speech recognition; binomial posteriori distribution; n-gram language model; n-gram probabilities; n-gram smoothing; natural spoken language; sparse data; Equations; Laboratories; Natural languages; Parameter estimation; Probability distribution; Smoothing methods; Statistical distributions; Statistics; Stochastic processes; Training data;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Acoustics, Speech, and Signal Processing, 1996. ICASSP-96. Conference Proceedings., 1996 IEEE International Conference on
Conference_Location :
Atlanta, GA
ISSN :
1520-6149
Print_ISBN :
0-7803-3192-3
Type :
conf
DOI :
10.1109/ICASSP.1996.540323
Filename :
540323