Title :
Universal compression of power-law distributions
Author :
Moein Falahatgar;Ashkan Jafarpour;Alon Orlitsky;Venkatadheeraj Pichapati;Ananda Theertha Suresh
Author_Institution :
University of California San Diego, USA
fDate :
6/1/2015 12:00:00 AM
Abstract :
English words and the outputs of many other natural processes are well known to follow a Zipf distribution. Yet this thoroughly established property has never been shown to help compress or predict these important processes. We show that the expected redundancy of Zipf distributions of order α > 1 is roughly the 1/α power of the expected redundancy of unrestricted distributions. Hence for these orders, Zipf distributions can be better compressed and predicted than was previously known. Unlike the expected case, we show that worst-case redundancy is roughly the same for Zipf and for unrestricted distributions. Hence Zipf distributions have significantly different worst-case and expected redundancies, making them the first natural distribution class shown to have such a difference.
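To make the terms in the abstract concrete, the following is a minimal sketch and not the authors' method. It builds a Zipf distribution of order α on a finite support and measures the per-symbol redundancy (the expected extra bits, i.e. the KL divergence) incurred when its samples are encoded with a code designed for a different distribution. The finite support size `k`, the uniform reference code, and the function names `zipf_pmf`, `entropy_bits`, and `redundancy_bits` are assumptions introduced for illustration. The paper's results concern the minimax redundancy over a whole class of distributions, which this sketch does not compute.

```python
import math


def zipf_pmf(alpha, k):
    """Zipf distribution of order alpha on {1, ..., k}: p_i proportional to i**(-alpha).

    The finite support size k is an assumption made for this illustration.
    """
    weights = [i ** (-alpha) for i in range(1, k + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def entropy_bits(p):
    """Shannon entropy of p in bits."""
    return -sum(x * math.log2(x) for x in p if x > 0)


def redundancy_bits(p, q):
    """Expected extra bits per symbol when samples from p are encoded
    with an ideal code designed for q (the KL divergence D(p || q))."""
    return sum(x * math.log2(x / y) for x, y in zip(p, q) if x > 0)


if __name__ == "__main__":
    k = 1000  # hypothetical alphabet size
    uniform = [1.0 / k] * k
    for alpha in (1.5, 2.0, 3.0):
        p = zipf_pmf(alpha, k)
        print(
            f"alpha={alpha}: entropy={entropy_bits(p):.3f} bits, "
            f"redundancy of uniform code={redundancy_bits(p, uniform):.3f} bits"
        )
```

For a uniform reference code the redundancy equals log2(k) minus the entropy of the Zipf distribution, so larger α (a more skewed distribution) leaves more room for a code that exploits the Zipf structure.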
Keywords :
"Redundancy","Entropy","Upper bound","Encoding","Dictionaries","Random variables","Markov processes"
Conference_Title :
Information Theory (ISIT), 2015 IEEE International Symposium on
Electronic_ISBN :
2157-8117
DOI :
10.1109/ISIT.2015.7282806