Title :
Universal compression for I.I.D. sources with large alphabets
Author_Institution :
Dept. of Electr. & Comput. Eng., Utah Univ., Salt Lake City, UT, USA
Date :
29 June-4 July 2003
Abstract :
The minimum description length (MDL) principle is derived for universal compression of i.i.d. sources with large alphabets whose size k may grow up to sub-linearly with the data sequence length n. Each unknown source probability parameter is shown to cost 0.5 log(n/k) bits. This cost is shown to be a lower bound both in the average minimax sense and for most sources in the class. The bound is shown to be achievable, even sequentially, with the well-known low-complexity Krichevsky-Trofimov scheme.
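To make the quantities in the abstract concrete, the following is a minimal sketch, assuming the standard add-1/2 form of the Krichevsky-Trofimov sequential estimator for a k-ary alphabet, P(next = a) = (count(a) + 1/2) / (t + k/2). The function names are hypothetical and are not taken from the paper, and the sketch does not reproduce the paper's derivation or any lower-order terms. It only computes the ideal sequential code length, the empirical entropy it can be compared against, and the leading per-parameter cost 0.5 log(n/k) stated in the abstract.

```python
import math
from collections import Counter


def kt_code_length_bits(sequence, k):
    """Ideal code length in bits under the add-1/2 (KT) sequential estimator.

    Assumption: symbols come from an alphabet of known size k, and each
    symbol is coded with probability (count + 1/2) / (t + k/2), where t is
    the number of symbols seen so far.
    """
    counts = Counter()
    total_bits = 0.0
    for t, symbol in enumerate(sequence):
        prob = (counts[symbol] + 0.5) / (t + k / 2.0)
        total_bits -= math.log2(prob)
        counts[symbol] += 1
    return total_bits


def empirical_entropy_bits(sequence):
    """n times the empirical entropy of the sequence, in bits."""
    n = len(sequence)
    return -sum(c * math.log2(c / n) for c in Counter(sequence).values())


def per_parameter_cost_bits(n, k):
    """Leading-order cost per unknown parameter quoted in the abstract."""
    return 0.5 * math.log2(n / k)


if __name__ == "__main__":
    k = 4
    seq = [0, 1, 2, 3] * 256  # hypothetical toy data, n = 1024
    n = len(seq)
    redundancy = kt_code_length_bits(seq, k) - empirical_entropy_bits(seq)
    print("KT redundancy (bits):", redundancy)
    print("(k - 1) * 0.5 * log2(n / k):", (k - 1) * per_parameter_cost_bits(n, k))
```

The two printed numbers are not expected to match exactly, since the abstract's cost is a leading-order statement; the comparison is only meant to show which quantities the result relates.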
Keywords :
minimax techniques; probability; source coding; Krichevsky-Trofimov low-complexity scheme; data sequence length; large alphabets; minimax sense; minimum description length; source probability; universal compression; Channel capacity; Cities and towns; Costs; Entropy; Error probability; Gas insulated transmission lines; Helium; Random sequences; Redundancy
Conference_Title :
Information Theory, 2003. Proceedings. IEEE International Symposium on
Print_ISBN :
0-7803-7728-1
DOI :
10.1109/ISIT.2003.1228038