Title :
Comparison of Text Models for BWT
Author :
Lansky, J. ; Chernik, Katsiaryna ; Vlckova, Z.
Author_Institution :
Charles Univ., Prague
Abstract :
The Burrows-Wheeler Transform (BWT) is a compression method that reorders an input string into a form that is better suited to subsequent compression. Usually, a Move-To-Front transform followed by Huffman coding is applied to the permuted string. This work compares single-file parsing methods applied to input text files compressed with the Burrows-Wheeler Transform, for three languages (English, Czech, and German). Because existing BWT-based methods use different block sizes and are each oriented toward compressing one element type, which makes mutual comparison harder, we modified the method so that it can compress using all required element types and uses a block size of 5 MB, which is larger than any test input file.
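The first two stages named in the abstract (BWT, then Move-To-Front) can be illustrated with a minimal Python sketch. This is not the authors' implementation: it is a naive textbook version that works on single characters, sorts all rotations, and assumes a sentinel character "\0" that does not occur in the input. The function names are hypothetical, and the Huffman stage and the 5 MB block handling are omitted.

```python
def bwt(s: str, eos: str = "\0") -> str:
    """Naive BWT: sort all rotations of s + eos and keep the last column."""
    assert eos not in s, "sentinel must not occur in the input (assumption)"
    t = s + eos
    rotations = sorted(t[i:] + t[:i] for i in range(len(t)))
    return "".join(r[-1] for r in rotations)


def inverse_bwt(last: str, eos: str = "\0") -> str:
    """Naive inverse: repeatedly prepend the last column and re-sort."""
    table = [""] * len(last)
    for _ in range(len(last)):
        table = sorted(last[i] + table[i] for i in range(len(last)))
    # The original string is the row that ends with the sentinel.
    row = next(r for r in table if r.endswith(eos))
    return row[:-1]


def move_to_front(data: str) -> list[int]:
    """Move-To-Front: emit each symbol's current index, then move it to the front."""
    alphabet = sorted(set(data))  # initial symbol list (an assumption of this sketch)
    out = []
    for ch in data:
        idx = alphabet.index(ch)
        out.append(idx)
        alphabet.insert(0, alphabet.pop(idx))
    return out


transformed = bwt("banana")
ranks = move_to_front(transformed)
```

The BWT tends to group equal symbols together, so the Move-To-Front output contains many small indices, which an entropy coder such as Huffman coding can then encode compactly.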
Keywords :
Huffman codes; data compression; text analysis; transform coding; Burrows-Wheeler transform; Huffman coding; compression method; file parsing method; move-to-front transform; Dictionaries; Encoding; Mathematical model; Mathematics; Natural languages; Physics; Sorting; Testing
Conference_Title :
Data Compression Conference, 2007. DCC '07
Conference_Location :
Snowbird, UT
Print_ISBN :
0-7695-2791-4
DOI :
10.1109/DCC.2007.21