  • DocumentCode
    2061564
  • Title
    Dynamic word based text compression
  • Author
    Ng, K.S. ; Cheng, L.M. ; Wong, C.H.
  • Author_Institution
    Dept. of Electron. Eng., City Univ. of Hong Kong, Kowloon, Hong Kong
  • Volume
    1
  • fYear
    1997
  • fDate
    18-20 Aug 1997
  • Firstpage
    412
  • Abstract
    We propose a dynamic text compression technique with a back searching algorithm and a new storage protocol. The codes being encoded are divided into three types: copy, literal, and hybrid codes. Multiple dictionaries are adopted, each with a linked sub-dictionary. Each dictionary holds a portion of predefined words (the most frequent words), and the remaining entries depend on the message. A hashing function developed by Pearson (1990) is adopted for two purposes: first, to initialize the dictionary; second, to search quickly for a particular word. With this scheme, the spaces between words do not need to be encoded. At the decoding side, a space character is appended after each word is decoded, so the redundancy of spaces is also compressed. The results show that the original message is not expanded even with a poor dictionary design.
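The abstract gives no implementation details, so the following is only a minimal sketch of the general idea, not the authors' method. It uses the standard 8-bit Pearson hash to index a word dictionary, emits copy codes for known words and literal codes for new ones, and lets the decoder restore the space after each word. The permutation table, the `PREDEFINED` word list, the `WordDictionary` class, and the code layout are all assumptions made for illustration. Hybrid codes, multiple dictionaries, sub-dictionaries, and the back searching algorithm are not defined in the abstract and are left out.

```python
import random

# Assumption: any fixed permutation of 0..255 can serve as the Pearson table.
# The table used in the paper is not given in the abstract.
TABLE = list(range(256))
random.Random(1990).shuffle(TABLE)

# Hypothetical set of predefined frequent words shared by encoder and decoder.
PREDEFINED = ["the", "and", "of"]


def pearson_hash(word: str) -> int:
    """Standard 8-bit Pearson hash: h = T[h XOR byte] for each input byte."""
    h = 0
    for byte in word.encode("utf-8"):
        h = TABLE[h ^ byte]
    return h


class WordDictionary:
    """256 buckets indexed by Pearson hash; each bucket is an append-only list."""

    def __init__(self, predefined):
        self.buckets = [[] for _ in range(256)]
        for word in predefined:  # the hash places the predefined words
            self.add(word)

    def add(self, word):
        bucket = self.buckets[pearson_hash(word)]
        if word not in bucket:
            bucket.append(word)

    def find(self, word):
        """Return (hash, position) if the word is known, else None."""
        h = pearson_hash(word)
        bucket = self.buckets[h]
        return (h, bucket.index(word)) if word in bucket else None


def encode(text, predefined=PREDEFINED):
    """Split on single spaces; spaces themselves are never emitted."""
    dictionary = WordDictionary(predefined)
    codes = []
    for word in text.split(" "):
        ref = dictionary.find(word)
        if ref is not None:
            codes.append(("copy", ref))       # word already in the dictionary
        else:
            codes.append(("literal", word))   # new word, sent as-is
            dictionary.add(word)              # entry now depends on the message
    return codes


def decode(codes, predefined=PREDEFINED):
    """Rebuild the same dictionary and append a space after each word."""
    dictionary = WordDictionary(predefined)
    out = []
    for kind, value in codes:
        if kind == "copy":
            h, position = value
            word = dictionary.buckets[h][position]
        else:
            word = value
            dictionary.add(word)
        out.append(word + " ")
    return "".join(out)[:-1]  # drop the space appended after the last word
```

For example, `decode(encode("the cat and the dog and the cat"))` returns the original string. Because the decoder adds words in the same order as the encoder, the bucket positions carried by the copy codes stay in step on both sides.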
  • Keywords
    backtracking; data compression; document image processing; file organisation; glossaries; image coding; memory protocols; search problems; back searching algorithm; copy codes; decoding; dictionaries; dynamic word based text compression; encoding; hashing function; hybrid codes; literal codes; message; redundancy; space character; storage protocol; Data compression; Decoding; Dictionaries; Probability; Protocols; Road transportation
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Document Analysis and Recognition, 1997., Proceedings of the Fourth International Conference on
  • Conference_Location
    Ulm
  • Print_ISBN
    0-8186-7898-4
  • Type
    conf
  • DOI
    10.1109/ICDAR.1997.619880
  • Filename
    619880