Title :
Compression of a Set of Strings
Author :
Lánský, Jan ; Zemlicka, Michal
Author_Institution :
Fac. of Math. & Phys., Charles Univ., Prague
Abstract :
Many compression methods use a dictionary, and some of them store the dictionary in the compressed message. In such cases, improvements in dictionary compression improve the overall performance of the compression method. We focus on compressing dictionaries of words or syllables. Dictionary compression is often based on a tree representation of the dictionary, and we expect that a suitable encoding of the tree can save a substantial amount of space. We therefore minimize the information stored for each tree node. For each node we store only the number of children, encoded with the Elias gamma code; the difference between the node's character and that of its left sibling, encoded with the Elias delta code; and a single bit indicating whether the node represents a dictionary item. The distance from the left sibling is omitted for the root, and the dictionary-item bit is omitted for leaves.
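The per-node encoding described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several details are assumptions that the abstract does not state: the trie is serialized in pre-order; the child count is stored as count + 1, because Elias gamma cannot encode 0; the first child's character is coded as its distance from 0, so the NUL character is excluded; and the three fields are written in the order count, distance, flag. The helper names `build_trie` and `encode_node` are hypothetical. Bits are kept as a string of '0' and '1' for readability.

```python
def elias_gamma(n: int) -> str:
    """Elias gamma code of a positive integer as a '0'/'1' string."""
    if n < 1:
        raise ValueError("Elias gamma is defined only for n >= 1")
    binary = bin(n)[2:]
    # (length - 1) zeros, then the binary representation itself.
    return "0" * (len(binary) - 1) + binary


def elias_delta(n: int) -> str:
    """Elias delta code of a positive integer as a '0'/'1' string."""
    if n < 1:
        raise ValueError("Elias delta is defined only for n >= 1")
    binary = bin(n)[2:]
    # Gamma-coded bit length, then the binary digits without the leading 1.
    return elias_gamma(len(binary)) + binary[1:]


def build_trie(words):
    """Build a plain dict-based trie (hypothetical helper for this sketch)."""
    root = {"children": {}, "is_word": False}
    for word in words:
        node = root
        for ch in word:
            node = node["children"].setdefault(
                ch, {"children": {}, "is_word": False}
            )
        node["is_word"] = True
    return root


def encode_node(node, is_root=True, char_gap=None):
    """Serialize a node and its subtree in pre-order (assumed order)."""
    children = sorted(node["children"].items())
    # Assumption: store count + 1 so that leaves (0 children) are encodable.
    bits = elias_gamma(len(children) + 1)
    if not is_root:
        # Distance from the left sibling's character; omitted for the root.
        bits += elias_delta(char_gap)
    if children:
        # Dictionary-item flag; omitted for leaves.
        bits += "1" if node["is_word"] else "0"
    prev = 0  # Assumption: the first child is coded relative to code point 0.
    for ch, child in children:
        bits += encode_node(child, is_root=False, char_gap=ord(ch) - prev)
        prev = ord(ch)
    return bits


if __name__ == "__main__":
    print(encode_node(build_trie(["a", "ab"])))
```

Omitting the flag for leaves loses no information, because in a trie built from a set of strings every leaf ends one of those strings. In this sketch the two-word dictionary {"a", "ab"} takes 31 bits, most of them spent on the first-child characters, which are coded relative to 0 rather than relative to a sibling.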
Keywords :
data compression; encoding; dictionary compression; string compression; tree representation; Dictionaries; Information theory; Mathematics; Physics
Conference_Title :
Data Compression Conference, 2007. DCC '07
Conference_Location :
Snowbird, UT
Print_ISBN :
0-7695-2791-4
DOI :
10.1109/DCC.2007.25