DocumentCode :
3615730
Title :
On the relation between additive smoothing and universal coding [language modeling]
Author :
N. Jevtic;A. Orlitsky
Author_Institution :
ECE Dept., Univ. of California San Diego, La Jolla, CA, USA
fYear :
2003
fDate :
2003
Firstpage :
537
Lastpage :
542
Abstract :
We analyze the performance of smoothing methods for language modeling from the perspective of universal compression. We use existing asymptotic bounds on the performance of simple additive rules for compression of finite-alphabet memoryless sources to explain the empirical predictive abilities of additive smoothing techniques. We further suggest a smoothing method that overcomes some of the problems observed in previous approaches. The new method outperforms existing ones on the Wall Street Journal (WSJ) database for bigram and trigram models. We then suggest possible directions for future research.
Keywords :
"Smoothing methods","Natural languages","Performance analysis","Speech recognition","Probability distribution","History","Training data","Laplace equations","Databases","Handwriting recognition"
Publisher :
IEEE
Conference_Title :
Automatic Speech Recognition and Understanding, 2003. ASRU '03. 2003 IEEE Workshop on
Print_ISBN :
0-7803-7980-2
Type :
conf
DOI :
10.1109/ASRU.2003.1318497
Filename :
1318497