DocumentCode
3145269
Title
Text compression using several Huffman trees
Author
Basu, Dipak
Author_Institution
Sch. of Comput. & Syst. Sci., Jawaharlal Nehru Univ., New Delhi, India
fYear
1991
fDate
8-11 Apr 1991
Firstpage
452
Abstract
Summary form only given. Noticing that a Huffman code is independent of the order in which the characters appear in the text, the author views the source text as columns of characters in which words appear as rows. The character frequency tables are compiled with respect to character positions within the column of words, i.e., one frequency table is constructed for the first character of all the words, another for the second character, and so on. Word delimiters (e.g., a space or punctuation mark) are used as word endings. Position-dependent code tables, one for each column, are computed using the Huffman algorithm. The text is scanned from left to right and each character is replaced by its code from the corresponding table; a delimiter serves to initialize (reset) the correspondence. Using several coding trees, one obtains a greater degree of compression without performing clustering computations.
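The scheme described in the abstract can be illustrated with a minimal Python sketch. It is not the author's implementation: the function names (`positions`, `huffman_code`, `build_tables`, `encode`, `decode`) are hypothetical, and the rule that any non-alphanumeric character counts as a word delimiter is an assumption made for illustration only. The delimiter is coded with the table of the column at which the word ends, and the column counter then resets to zero.

```python
import heapq
from collections import Counter, defaultdict
from itertools import count


def positions(text):
    """Yield (column, char); the column resets to 0 after each delimiter."""
    col = 0
    for ch in text:
        yield col, ch
        # Assumption: any non-alphanumeric character is a word delimiter.
        col = col + 1 if ch.isalnum() else 0


def huffman_code(freq):
    """Build a prefix code {symbol: bitstring} from a frequency table."""
    if len(freq) == 1:  # a lone symbol still needs one bit
        return {sym: "0" for sym in freq}
    tie = count()  # unique tie-breaker so the heap never compares dicts
    heap = [(w, next(tie), {sym: ""}) for sym, w in freq.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        w1, _, c1 = heapq.heappop(heap)
        w2, _, c2 = heapq.heappop(heap)
        # Merge the two lightest subtrees, prefixing their codes with 0 / 1.
        merged = {s: "0" + b for s, b in c1.items()}
        merged.update({s: "1" + b for s, b in c2.items()})
        heapq.heappush(heap, (w1 + w2, next(tie), merged))
    return heap[0][2]


def build_tables(text):
    """One frequency table, and hence one Huffman code, per column."""
    freqs = defaultdict(Counter)
    for col, ch in positions(text):
        freqs[col][ch] += 1
    return {col: huffman_code(f) for col, f in freqs.items()}


def encode(text, tables):
    """Scan left to right, taking each code from its column's table."""
    return "".join(tables[col][ch] for col, ch in positions(text))


def decode(bits, tables):
    """Invert encode(); the decoder tracks the column the same way."""
    inverse = {col: {b: s for s, b in t.items()} for col, t in tables.items()}
    out, col, buf = [], 0, ""
    for bit in bits:
        buf += bit
        ch = inverse[col].get(buf)
        if ch is not None:  # prefix code: first match is the symbol
            out.append(ch)
            col = col + 1 if ch.isalnum() else 0
            buf = ""
    return "".join(out)
```

A round trip such as `decode(encode(text, build_tables(text)), build_tables(text)) == text` should hold for any input. The sketch ignores the cost of storing the per-column tables, which a real compressor would have to transmit alongside the coded text.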
Keywords
codes; data compression; trees (mathematics); Huffman code; Huffman trees; character frequency tables; coding trees; delimiter; text compression; Clustering algorithms; Decoding; Encoding; Frequency; Greedy algorithms; Heuristic algorithms; Huffman coding; Information theory; Partitioning algorithms; Performance analysis;
fLanguage
English
Publisher
IEEE
Conference_Titel
Data Compression Conference, 1991. DCC '91.
Conference_Location
Snowbird, UT
Print_ISBN
0-8186-9202-2
Type
conf
DOI
10.1109/DCC.1991.213309
Filename
213309