DocumentCode
2061564
Title
Dynamic word based text compression
Author
Ng, K.S. ; Cheng, L.M. ; Wong, C.H.
Author_Institution
Dept. of Electron. Eng., City Univ. of Hong Kong, Kowloon, Hong Kong
Volume
1
fYear
1997
fDate
18-20 Aug 1997
Firstpage
412
Abstract
We propose a dynamic text compression technique with a back-searching algorithm and a new storage protocol. The codes produced by the encoder are divided into three types: copy, literal, and hybrid codes. Multiple dictionaries are adopted, each with a linked sub-dictionary. Each dictionary holds a portion of predefined words (the most frequent words), and the remaining entries depend on the message. A hashing function developed by Pearson (1990) is adopted for two purposes: first, to initialize the dictionary; second, to provide a quick search for a particular word. With this scheme, the spaces between words do not need to be encoded. At the decoder, a space character is appended after each word is decoded, so the redundancy of spaces is also compressed. The results show that the original message is not expanded even if the dictionary design is poor.
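As an illustration of two ideas named in the abstract, the following is a minimal Python sketch of an 8-bit Pearson-style hash used to place and find words in a dictionary, together with the decoder-side rule of appending a space after each word. It is not the authors' implementation: the permutation table is generated from a fixed seed rather than taken from Pearson's paper, and the names `make_table`, `build_dictionary`, `lookup`, and `decode_words` are hypothetical. The copy, literal, and hybrid codes and the linked sub-dictionaries are not modelled.

```python
import random


def make_table(seed=0):
    # Assumption: any permutation of 0..255 serves for illustration;
    # this is NOT the specific table published by Pearson (1990).
    table = list(range(256))
    random.Random(seed).shuffle(table)
    return table


TABLE = make_table()


def pearson_hash(word, table=TABLE):
    # Pearson hashing: start at 0 and, for each input byte,
    # replace the hash with table[hash XOR byte].
    h = 0
    for byte in word.encode("utf-8"):
        h = table[h ^ byte]
    return h


def build_dictionary(words):
    # Hypothetical layout: 256 buckets indexed by the hash value,
    # each bucket holding the words that collide on that value.
    buckets = [[] for _ in range(256)]
    for w in words:
        bucket = buckets[pearson_hash(w)]
        if w not in bucket:
            bucket.append(w)
    return buckets


def lookup(buckets, word):
    # Quick search: only one bucket has to be scanned.
    # Returns (hash value, position in bucket), or None if absent.
    h = pearson_hash(word)
    bucket = buckets[h]
    if word in bucket:
        return h, bucket.index(word)
    return None


def decode_words(words):
    # Decoder-side rule from the abstract: append a space after
    # each decoded word, so spaces need not be transmitted.
    return "".join(w + " " for w in words)
```

For example, `lookup(build_dictionary(["the", "of", "and"]), "of")` returns the bucket index and position of `"of"`, and `decode_words(["dynamic", "text"])` rebuilds the string with a space after each word.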
Keywords
backtracking; data compression; document image processing; file organisation; glossaries; image coding; memory protocols; search problems; back searching algorithm; copy codes; decoding; dictionaries; dynamic word based text compression; encoding; hashing function; hybrid codes; literal codes; message; redundancy; space character; storage protocol; Data compression; Decoding; Dictionaries; Probability; Protocols; Road transportation;
fLanguage
English
Publisher
IEEE
Conference_Titel
Proceedings of the Fourth International Conference on Document Analysis and Recognition, 1997
Conference_Location
Ulm
Print_ISBN
0-8186-7898-4
Type
conf
DOI
10.1109/ICDAR.1997.619880
Filename
619880