DocumentCode :
3145269
Title :
Text compression using several Huffman trees
Author :
Basu, Dipak
Author_Institution :
Sch. of Comput. & Syst. Sci., Jawaharlal Nehru Univ., New Delhi, India
fYear :
1991
fDate :
8-11 Apr 1991
Firstpage :
452
Abstract :
Summary form only given. Noticing that a Huffman code is independent of the order in which the characters appear in the text, the author views the source text as columns of characters in which words appear as rows. The character frequency tables are compiled with respect to character positions within the columns of words, i.e., one frequency table is constructed for the first character of all the words, another for the second character, and so on. Word delimiters (e.g., a space or punctuation mark) are used as word endings. Position-dependent code tables, one for each column, are computed using the Huffman algorithm. The text is scanned from left to right and each character is replaced by its code from the corresponding table; a delimiter serves to initialize (reset) the correspondence. Using several coding trees, one obtains a greater degree of compression without performing clustering computations.
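The scheme described in the abstract can be illustrated with a minimal Python sketch. This is not the author's implementation: the function names (`huffman_code`, `positions`, `build_tables`, `encode`, `decode`) and the delimiter set `DELIMS` are hypothetical choices made for illustration, and the sketch assumes a delimiter is coded in the column where it occurs before the column counter resets. The cost of storing the per-column tables is ignored.

```python
import heapq
from collections import Counter, defaultdict
from itertools import count

# Assumption: this delimiter set is illustrative; the abstract only says
# "a space or punctuation mark".
DELIMS = set(" \t\n.,;:!?")

def huffman_code(freq):
    """Build a prefix code {symbol: bitstring} from a frequency table."""
    if len(freq) == 1:                      # degenerate one-symbol table
        return {next(iter(freq)): "0"}
    tiebreak = count()                      # keeps heap entries comparable
    heap = [(f, next(tiebreak), {s: ""}) for s, f in freq.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, c1 = heapq.heappop(heap)     # two least frequent subtrees
        f2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + b for s, b in c1.items()}
        merged.update({s: "1" + b for s, b in c2.items()})
        heapq.heappush(heap, (f1 + f2, next(tiebreak), merged))
    return heap[0][2]

def positions(text):
    """Yield (column, char); the column resets to 0 after a delimiter."""
    col = 0
    for ch in text:
        yield col, ch
        col = 0 if ch in DELIMS else col + 1

def build_tables(text):
    """One frequency table, hence one Huffman code, per character column."""
    freqs = defaultdict(Counter)
    for col, ch in positions(text):
        freqs[col][ch] += 1
    return {col: huffman_code(f) for col, f in freqs.items()}

def encode(text, tables):
    """Scan left to right, taking each code from the current column's table."""
    return "".join(tables[col][ch] for col, ch in positions(text))

def decode(bits, tables):
    """Invert encode(); the decoder tracks the column the same way."""
    inverse = {c: {b: s for s, b in t.items()} for c, t in tables.items()}
    out, col, buf = [], 0, ""
    for bit in bits:
        buf += bit
        ch = inverse[col].get(buf)
        if ch is not None:                  # a complete codeword was read
            out.append(ch)
            col = 0 if ch in DELIMS else col + 1
            buf = ""
    return "".join(out)
```

Because the decoder resets its column counter on the same delimiters as the encoder, it always knows which table to consult, so no extra synchronization information is needed in the bit stream.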
Keywords :
codes; data compression; trees (mathematics); Huffman code; Huffman trees; character frequency tables; coding trees; delimiter; text compression; Clustering algorithms; Decoding; Encoding; Frequency; Greedy algorithms; Heuristic algorithms; Huffman coding; Information theory; Partitioning algorithms; Performance analysis;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Data Compression Conference, 1991. DCC '91.
Conference_Location :
Snowbird, UT
Print_ISBN :
0-8186-9202-2
Type :
conf
DOI :
10.1109/DCC.1991.213309
Filename :
213309