Title :
A character elimination algorithm for lossless data compression
fDate :
2002
Abstract :
Summary form only given. We present a detailed description of a lossless compression algorithm intended for files with non-uniform character distributions. The algorithm exploits the relatively small distances between successive occurrences of a character once the less frequent characters have been removed. This lets it produce a compressed file that decompresses to an exact copy of the original. We begin by applying the Burrows-Wheeler Transform (BWT, 1994) to the file. The algorithm then scans the transformed file to build a character frequency model for the compression phase. To sidestep the issue of bit-level encoding, every number is written to the compressed file as a byte or a sequence of bytes, and an arithmetic encoder is run once the compressed file has been fully assembled.
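The first two stages named in the abstract (the BWT and the character frequency scan) can be sketched as follows. This is a minimal illustration and not the authors' implementation: the sentinel character, the naive rotation-sorting transform, and the helper names `bwt`, `inverse_bwt`, `frequency_model`, `keep_most_frequent`, and `occurrence_gaps` are all assumptions made for the example. The last two helpers only illustrate the observation that gaps between occurrences shrink once rare characters are dropped; the actual elimination and encoding scheme is not specified in this summary.

```python
from collections import Counter

# Assumed end-of-text marker; it must not occur in the input.
SENTINEL = "\x00"


def bwt(text):
    """Naive Burrows-Wheeler Transform: sort all rotations, keep the last column."""
    if SENTINEL in text:
        raise ValueError("input must not contain the sentinel character")
    s = text + SENTINEL
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rot[-1] for rot in rotations)


def inverse_bwt(last_column):
    """Invert the transform by repeatedly prepending the last column and sorting."""
    table = [""] * len(last_column)
    for _ in range(len(last_column)):
        table = sorted(c + row for c, row in zip(last_column, table))
    original = next(row for row in table if row.endswith(SENTINEL))
    return original[:-1]


def frequency_model(data):
    """Return (character, count) pairs, most frequent first."""
    return Counter(data).most_common()


def keep_most_frequent(data, k):
    """Hypothetical helper: drop every character outside the k most frequent."""
    kept = {ch for ch, _ in frequency_model(data)[:k]}
    return "".join(c for c in data if c in kept)


def occurrence_gaps(data, ch):
    """Distances between successive occurrences of ch in data."""
    positions = [i for i, c in enumerate(data) if c == ch]
    return [b - a for a, b in zip(positions, positions[1:])]


if __name__ == "__main__":
    transformed = bwt("banana")
    print(repr(transformed))
    print(frequency_model(transformed))
    print(occurrence_gaps(transformed, "a"))
    print(occurrence_gaps(keep_most_frequent(transformed, 2), "a"))
    print(inverse_bwt(transformed))
```

The rotation sort is quadratic in memory and is used only for readability; a practical implementation would typically build the transform from a suffix array.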
Keywords :
arithmetic codes; data communication; encoding; transform coding; transforms; Burrows-Wheeler transform; arithmetic encoder; bit encoding; character elimination algorithm; character frequency model; compressed file; lossless data compression; Arithmetic; Compression algorithms; Data compression; Decoding; Encoding; Frequency; Writing;
Conference_Title :
Data Compression Conference, 2002. Proceedings. DCC 2002
Print_ISBN :
0-7695-1477-4
DOI :
10.1109/DCC.2002.1000000