Title :
Fast compression with a static model in high-order entropy
Author :
Foschini, Luca ; Grossi, Roberto ; Gupta, Ankur ; Vitter, Jeffrey Scott
Author_Institution :
Scuola Superiore Sant'Anna, Pisa, Italy
Abstract :
We report on a simple encoding format called wzip for decompressing block-sorting transforms, such as the Burrows-Wheeler transform (BWT). Our compressor uses the simple notions of gamma encoding and RLE, organized with a wavelet tree, to achieve a slightly better compression ratio than bzip2 in less time. In fact, our compression/decompression time is dependent on H_h, the h-th order empirical entropy. This relationship of performance to the compressibility of data is a key new idea among compression algorithms. Another key contribution of our compressor is its simplicity. Our compressor can also operate as a full-text index with a small amount of data, while still preserving backward compatibility with just the compressor.
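The abstract names gamma encoding and RLE as the building blocks. The following is a minimal sketch of how those two ideas can combine on a single bit vector, such as one stored at a wavelet tree node. It is not the wzip format itself: the function names and the layout (first bit, then Elias gamma codes of the run lengths) are assumptions made for illustration.

```python
def gamma_encode(n: int) -> str:
    """Elias gamma code of a positive integer, returned as a bit string."""
    if n < 1:
        raise ValueError("gamma codes are defined for n >= 1")
    binary = bin(n)[2:]
    # (len - 1) zeros announce the length, then the binary digits follow.
    return "0" * (len(binary) - 1) + binary


def run_lengths(bits: str) -> list[int]:
    """Lengths of the maximal runs of equal bits, in order."""
    runs: list[int] = []
    prev = None
    for b in bits:
        if b == prev:
            runs[-1] += 1
        else:
            runs.append(1)
            prev = b
    return runs


def encode_bitvector(bits: str) -> str:
    """Hypothetical layout: the first bit, then gamma-coded run lengths."""
    if not bits:
        return ""
    return bits[0] + "".join(gamma_encode(r) for r in run_lengths(bits))


def decode_bitvector(code: str) -> str:
    """Inverse of encode_bitvector under the same assumed layout."""
    if not code:
        return ""
    bit = code[0]
    out = []
    i = 1
    while i < len(code):
        zeros = 0
        while code[i] == "0":  # count the unary length prefix
            zeros += 1
            i += 1
        run = int(code[i:i + zeros + 1], 2)
        i += zeros + 1
        out.append(bit * run)
        bit = "1" if bit == "0" else "0"  # runs alternate between 0 and 1
    return "".join(out)
```

Long runs cost few bits under this scheme (a run of length r takes 2*floor(log2 r) + 1 bits), which gives some intuition for why highly compressible inputs are also cheap to process.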
Keywords :
data compression; encoding; entropy; indexing; transform coding; wavelet transforms; Burrows-Wheeler transform; RLE; block-sorting transform decompression; compression ratio; data compressibility; encoding format; fast compression; full-text indexing; gamma encoding; high-order entropy; static model; wavelet tree; wzip; Biological information theory; Biology computing; Compression algorithms; Costs; Decoding; Dictionaries; Encoding; Entropy; Military computing; Statistics;
Conference_Title :
Data Compression Conference, 2004. Proceedings. DCC 2004
Print_ISBN :
0-7695-2082-0
DOI :
10.1109/DCC.2004.1281451