DocumentCode :
2056856
Title :
Algorithms for modeling distributions over large alphabets
Author :
Orlitsky, Alon ; Sajama ; Santhanam, Narayana ; Viswanathan, Krishnamurthy ; Zhang, Junan
Author_Institution :
Dept. of Electr. & Comput. Eng., California Univ., San Diego, La Jolla, CA
fYear :
2004
fDate :
2004
Firstpage :
304
Lastpage :
304
Abstract :
We consider the problem of modeling a distribution whose alphabet size is large relative to the amount of observed data. It is well known that conventional maximum-likelihood estimates do not perform well in that regime. Instead, we find the distribution maximizing the probability of the data's pattern. We derive an efficient algorithm for approximating this distribution. Simulations show that the computed distribution models the data well and yields general estimators that evaluate various data attributes as well as specific estimators designed especially for these tasks.
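The abstract refers to the "pattern" of the data without defining it. As a minimal illustration, assuming the usual definition in which each symbol is replaced by the order of its first appearance, the pattern of a sequence could be computed as below. The function name `pattern` is hypothetical and is not taken from the paper; the sketch does not implement the paper's algorithm for approximating the maximizing distribution.

```python
def pattern(sequence):
    """Return the pattern of a sequence: each symbol is replaced by the
    1-based index of its first appearance (assumed definition)."""
    first_seen = {}  # symbol -> index assigned at its first occurrence
    result = []
    for symbol in sequence:
        if symbol not in first_seen:
            # A new symbol receives the next unused index.
            first_seen[symbol] = len(first_seen) + 1
        result.append(first_seen[symbol])
    return result


if __name__ == "__main__":
    print(pattern("abracadabra"))
```

The pattern discards the identities of the symbols and keeps only their repetition structure, so two sequences that differ by a relabeling of the alphabet share the same pattern.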
Keywords :
data compression; estimation theory; probability; sequences; alphabet size; data pattern; distribution model; Algorithm design and analysis; Computational modeling; Data engineering; Distributed computing; Frequency; Lagrangian functions; Maximum likelihood estimation; Probability distribution; Testing; Yield estimation
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Information Theory, 2004. ISIT 2004. Proceedings. International Symposium on
Conference_Location :
Chicago, IL
Print_ISBN :
0-7803-8280-3
Type :
conf
DOI :
10.1109/ISIT.2004.1365341
Filename :
1365341