• DocumentCode
    397320
  • Title
    Universal compression for I.I.D. sources with large alphabets
  • Author
    Shamir, Gil I.
  • Author_Institution
    Dept. of Electr. & Comput. Eng., Utah Univ., Salt Lake City, UT, USA
  • fYear
    2003
  • fDate
    29 June-4 July 2003
  • Firstpage
    24
  • Abstract
    The minimum description length (MDL) principle is derived for universal compression of i.i.d. sources with large alphabets of size k, where k may grow up to sub-linearly with the data sequence length n. Each unknown source probability parameter is shown to cost 0.5 log(n/k) bits. This cost is shown to be a lower bound in the average minimax sense, and also for most sources in the class. The bound is shown to be achievable, even sequentially, with the well-known low-complexity Krichevsky-Trofimov scheme.
  • Keywords
    minimax techniques; probability; source coding; Krichevsky-Trofimov low-complexity scheme; data sequence length; large alphabets; minimax sense; minimum description length; source probability; universal compression; Channel capacity; Cities and towns; Costs; Entropy; Error probability; Gas insulated transmission lines; Helium; Minimax techniques; Random sequences; Redundancy;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Title
    Information Theory, 2003. Proceedings. IEEE International Symposium on
  • Print_ISBN
    0-7803-7728-1
  • Type
    conf
  • DOI
    10.1109/ISIT.2003.1228038
  • Filename
    1228038
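
As a rough illustration of the quantities named in the abstract, the Python sketch below computes the stated per-parameter cost 0.5 log(n/k) and the code length of a sequential add-1/2 (Krichevsky-Trofimov) estimator. The function names, the base-2 logarithm, the integer symbol encoding, and the specific add-1/2 form of the estimator are assumptions made for illustration; they are not taken from the paper itself.

```python
import math
from collections import Counter


def per_parameter_cost_bits(n, k):
    """Per-parameter cost as stated in the abstract: 0.5 * log(n / k).

    Assumption: the logarithm is base 2, since the cost is quoted in bits.
    """
    return 0.5 * math.log2(n / k)


def kt_code_length_bits(sequence, k):
    """Code length in bits of a sequential add-1/2 (KT) estimator.

    Assumption: symbols are integers in {0, ..., k - 1}.
    """
    counts = Counter()
    total_bits = 0.0
    for t, symbol in enumerate(sequence):
        # Probability assigned to the next symbol after t observations.
        prob = (counts[symbol] + 0.5) / (t + k / 2)
        total_bits -= math.log2(prob)
        counts[symbol] += 1
    return total_bits
```

For instance, `kt_code_length_bits([0, 1], k=2)` assigns probabilities 1/2 and 1/4 to the two symbols, for a total of 3 bits.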