• DocumentCode
    918829
  • Title
    Mismatched codebooks and the role of entropy coding in lossy data compression
  • Author
    Kontoyiannis, Ioannis; Zamir, Ram
  • Author_Institution
    Dept. of Comput. Sci., Brown Univ., Providence, RI, USA
  • Volume
    52
  • Issue
    5
  • fYear
    2006
  • fDate
    5/1/2006 12:00:00 AM
  • Firstpage
    1922
  • Lastpage
    1938
  • Abstract
    We introduce a universal quantization scheme based on random coding, and we analyze its performance. This scheme consists of a source-independent random codebook (typically mismatched to the source distribution), followed by optimal entropy coding that is matched to the quantized codeword distribution. A single-letter formula is derived for the rate achieved by this scheme at a given distortion, in the limit of large codebook dimension. The rate reduction due to entropy coding is quantified, and it is shown that it can be arbitrarily large. In the special case of "almost uniform" codebooks (e.g., an independent and identically distributed (i.i.d.) Gaussian codebook with large variance) and difference distortion measures, a novel connection is drawn between the compression achieved by the present scheme and the performance of "universal" entropy-coded dithered lattice quantizers. This connection generalizes the "half-a-bit" bound on the redundancy of dithered lattice quantizers. Moreover, it demonstrates a strong notion of universality where a single "almost uniform" codebook is near optimal for any source and any difference distortion measure. The proofs are based on the fact that the limiting empirical distribution of the first matching codeword in a random codebook can be precisely identified. This is done using elaborate large deviations techniques that allow the derivation of a new "almost sure" version of the conditional limit theorem.
  • Keywords
    entropy codes; random codes; source coding; vector quantisation; empirical distribution; entropy coded dithered lattice quantizer; lossy data compression; mismatched codebook; quantized codeword distribution; source-independent random codebook; Data compression; Distortion measurement; Entropy coding; Lattices; Pattern matching; Performance analysis; Performance loss; Quantization; Rate-distortion; Robustness; large deviations; mismatch; pattern matching; random coding; rate-distortion theory; robustness; universal Gaussian codebook; universal quantization
  • fLanguage
    English
  • Journal_Title
    Information Theory, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    0018-9448
  • Type
    jour
  • DOI
    10.1109/TIT.2006.872845
  • Filename
    1624632