• DocumentCode
    3431871
  • Title

    Some entropic bounds for Lempel-Ziv algorithms

  • Author

    Kosaraju, S. Rao ; Manzini, Giovanni

  • Author_Institution
    Dept. of Comput. Sci., Johns Hopkins Univ., Baltimore, MD, USA
  • fYear
    1997
  • fDate
    25-27 Mar 1997
  • Firstpage
    446
  • Abstract
    Summary form only given, as follows. We initiate a study of parsing-based compression algorithms such as LZ77 and LZ78 by considering the empirical entropy of the input string. For any string s, we define the k-th order entropy Hk(s) by looking at the number of occurrences of each symbol following each k-length substring inside s. The value Hk(s) is a lower bound to the compression ratio of a statistical modeling algorithm which predicts the probability of the next symbol by looking at the k most recently seen characters. Therefore, our analysis provides a means for comparing Lempel-Ziv methods with the more powerful, but slower, PPM algorithms. Our main contribution is a comparison of the compression ratio of Lempel-Ziv algorithms with the zeroth order entropy H0. First we show that for low entropy strings LZ78 compression ratio can be much higher than H0. Then, we present a modified algorithm which combines LZ78 with run length encoding and is able to compress efficiently also low entropy strings
  • Keywords
    data compression; entropy codes; runlength codes; LZ77; LZ78; Lempel-Ziv algorithms; Lempel-Ziv methods; PPM algorithms; compression ratio; entropic bounds; input string; low entropy strings; lower bound; modified algorithm; parsing based compression algorithms; probability; run length encoding; statistical modeling algorithm; substring; zeroth order entropy; Algorithm design and analysis; Compression algorithms; Computer science; Encoding; Entropy; Prediction algorithms; Predictive models; Probability;
  • fLanguage
    English
  • Publisher
    ieee
  • Conference_Titel
    Data Compression Conference, 1997. DCC '97. Proceedings
  • Conference_Location
    Snowbird, UT
  • ISSN
    1068-0314
  • Print_ISBN
    0-8186-7761-9
  • Type

    conf

  • DOI
    10.1109/DCC.1997.582106
  • Filename
    582106