DocumentCode :
1938551
Title :
PPM performance with BWT complexity: a new method for lossless data compression
Author :
Effros, Michelle
Author_Institution :
Dept. of Electr. Eng., California Inst. of Technol., Pasadena, CA, USA
fYear :
2000
fDate :
2000
Firstpage :
203
Lastpage :
212
Abstract :
This work combines a new fast context-search algorithm with the lossless source coding models of PPM to achieve a lossless data compression algorithm with the linear context-search complexity and memory of BWT and Ziv-Lempel codes and the compression performance of PPM-based algorithms. Both sequential and nonsequential encoding are considered. The proposed algorithm yields an average rate of 2.27 bits per character (bpc) on the Calgary corpus, comparing favorably to the 2.33 and 2.34 bpc of PPM5 and PPM* and the 2.43 bpc of BW94 but not matching the 2.12 bpc of PPMZ9, which, at the time of this publication, gives the greatest compression of all algorithms reported on the Calgary corpus results page. The proposed algorithm gives an average rate of 2.14 bpc on the Canterbury corpus. The Canterbury corpus Web page gives average rates of 1.99 bpc for PPMZ9, 2.11 bpc for PPM5, 2.15 bpc for PPM7, and 2.23 bpc for BZIP2 (a BWT-based code) on the same data set.
Keywords :
computational complexity; search problems; sequential codes; source coding; BWT complexity; Calgary corpus; PPM performance; Ziv-Lempel codes; linear context-search complexity; lossless data compression; nonsequential encoding; sequential encoding; Computational complexity; Context modeling; Data compression; Decoding; Encoding; History; Performance loss; Source coding; Testing; Web pages
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Data Compression Conference, 2000. Proceedings. DCC 2000
Conference_Location :
Snowbird, UT
ISSN :
1068-0314
Print_ISBN :
0-7695-0592-9
Type :
conf
DOI :
10.1109/DCC.2000.838160
Filename :
838160