• DocumentCode
    3559045
  • Title

    On the Construction of (Explicit) Khodak's Code and Its Analysis

  • Author

    Bugeaud, Yann ; Drmota, Michael ; Szpankowski, Wojciech

  • Author_Institution
    Dept. de Math., Univ. Louis Pasteur, Strasbourg
  • Volume
    54
  • Issue
    11
  • fYear
    2008
  • Firstpage
    5073
  • Lastpage
    5086
  • Abstract
    Variable-to-variable (VV) codes are very attractive yet not well understood data compression schemes. In 1972, Khodak claimed to provide upper and lower bounds for the achievable redundancy rate; however, he did not offer an explicit construction of such codes. In this paper, we first present a constructive and transparent proof of Khodak's result showing that for memoryless sources there exists a code with the average redundancy bounded by $D^{-5/3}$, where $D$ is the average delay (e.g., the average length of a dictionary entry). We also describe an algorithm that constructs a VV length code with a small redundancy rate for large $D$. Then, we discuss several generalizations. We prove that the worst-case redundancy does not exceed $D^{-4/3}$. Furthermore, we provide a similar upper bound for Markov sources (of order 1). Finally, we consider bounds that are valid for almost all memoryless and Markov sources, that is, for which the set of exceptional source parameters has zero measure. In particular, for all memoryless sources outside this exceptional class, we prove there exists a VV code with the average redundancy rate bounded by $D^{-1-m/3+\varepsilon}$ and the worst-case redundancy rate bounded by $D^{-1-m/3+\varepsilon}$, where $m$ is the cardinality of the alphabet. We complete our analysis with a lower bound showing that for all VV codes the average and the worst-case redundancy rates are at least $D^{-2m-1-\varepsilon}$ for almost all memoryless sources, in the sense that the set of exceptional source parameters has zero measure. We prove these results using techniques of Diophantine approximations.
  • Keywords
    Markov processes; data compression; redundancy; variable length codes; Diophantine approximation technique; Khodak code analysis; Markov source; VV code; average redundancy rate; data compression scheme; memoryless source; variable-to-variable codes; Computer science; Data compression; Delay; Dictionaries; Encoding; Information analysis; Information theory; Source coding; Upper bound; Average and maximal redundancy rates; metric Diophantine approximations; variable-to-variable (VV) length codes;
  • fLanguage
    English
  • Journal_Title
    Information Theory, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    0018-9448
  • Type

    jour

  • DOI
    10.1109/TIT.2008.929959
  • Filename
    4655437