DocumentCode
3530726
Title
Extensions of absolute discounting (Kneser-Ney method)
Author
Andrés-Ferrer, Jesús ; Ney, H.
Author_Institution
Univ. Politec. de Valencia, Valencia
fYear
2009
fDate
19-24 April 2009
Firstpage
4729
Lastpage
4732
Abstract
The problem of estimating the parameters of an n-gram language model is a typical problem of estimating small probabilities. So far, two methods have been proposed and used to handle this problem: (1) the empirical Bayes method, which results in the Turing-Good estimates. These estimates do not have any constraints and tend to be very noisy. (2) Discounting models such as absolute (or linear) discounting. The discounting models are heavily constrained and typically have only a single free parameter. Both methods can be formulated in a leaving-one-out framework. In this paper, we study methods that lie between these two extremes. We design models with various types of constraints and derive efficient algorithms for estimating the parameters of these models. We propose two novel types of constraints or models: interval constraints and the exact extended Kneser-Ney model. The proposed methods are implemented and applied to language modelling in order to compare the methods in terms of perplexities. The results show that the new constrained methods outperform other unconstrained methods.
Keywords
Bayes methods; computational linguistics; Bayes method; Kneser-Ney method; Turing-Good estimates; absolute discounting; n-gram language model; Algorithm design and analysis; Bayesian methods; Parameter estimation; Proposals; Smoothing methods; Training data; Kneser-Ney smoothing; language modelling; language smoothing; leaving one out;
fLanguage
English
Publisher
IEEE
Conference_Titel
Acoustics, Speech and Signal Processing, 2009. ICASSP 2009. IEEE International Conference on
Conference_Location
Taipei
ISSN
1520-6149
Print_ISBN
978-1-4244-2353-8
Electronic_ISBN
1520-6149
Type
conf
DOI
10.1109/ICASSP.2009.4960687
Filename
4960687