Title :
Simple lossless preprocessing algorithms for text compression
Author :
Robert, L. ; Nadarajan, R.
Author_Institution :
Dept. of Comput. Sci., Gov. Arts Coll., Coimbatore
fDate :
2/1/2009 12:00:00 AM
Abstract :
Lossless data compression researchers have developed highly sophisticated approaches, such as Huffman encoding, arithmetic coding, the Lempel-Ziv family, prediction by partial matching and Burrows-Wheeler transform based algorithms. One approach to attaining better compression is to develop a generic, reversible transformation that can be applied to a source text to improve an existing compression algorithm's ability to compress it. A few reversible transformation techniques that give better compression ratios are presented. A method that transforms a text file into an intermediate file with the minimum possible byte values is proposed. An attempt has been made to reduce the number of possible bytes that appear after every byte in the source file. This improves the backend algorithm's compression performance.
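The quantity the abstract targets, the number of different bytes that can follow a given byte, can be measured directly. The Python sketch below is only an illustration of that measurement, not the authors' transformation: the function names `successor_counts` and `average_successors` and the sample string are assumptions introduced here. It reports, for each byte value in an input, how many distinct byte values appear immediately after it. Under the abstract's reasoning, a reversible preprocessing step that lowers these counts should make the text easier for a backend compressor to handle.

```python
from collections import defaultdict


def successor_counts(data: bytes) -> dict[int, int]:
    """Map each byte value to the number of distinct byte values that follow it."""
    followers = defaultdict(set)
    # Walk over adjacent byte pairs (current, next) in the input.
    for current, nxt in zip(data, data[1:]):
        followers[current].add(nxt)
    return {byte: len(nexts) for byte, nexts in followers.items()}


def average_successors(data: bytes) -> float:
    """Average number of distinct successors over the bytes that have any successor."""
    counts = successor_counts(data)
    return sum(counts.values()) / len(counts) if counts else 0.0


if __name__ == "__main__":
    sample = b"the cat and the hat"  # hypothetical sample text
    print("distinct byte values:", len(set(sample)))
    print("average distinct successors per byte:", average_successors(sample))
```

Running the same measurement on a file before and after a candidate transformation gives a rough indication of whether that transformation narrows the set of bytes following each byte.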
Keywords :
Huffman codes; arithmetic codes; data compression; text analysis; transform coding; Burrows-Wheeler transform; Huffman encoding; Lempel-Ziv family; arithmetic coding; lossless data compression; lossless preprocessing algorithms; text compression
Journal_Title :
Software, IET
DOI :
10.1049/iet-sen:20070106