• DocumentCode
    3145269
  • Title

    Text compression using several Huffman trees

  • Author

    Basu, Dipak

  • Author_Institution
    Sch. of Comput. & Syst. Sci., Jawaharlal Nehru Univ., New Delhi, India
  • fYear
    1991
  • fDate
    8-11 Apr 1991
  • Firstpage
    452
  • Abstract
    Summary form only given. Noting that a Huffman code is independent of the order in which the characters appear in the text, the author views the source text as columns of characters in which words appear as rows. Character frequency tables are compiled with respect to character position within the words, i.e., one frequency table is constructed for the first character of all the words, another for the second character, and so on. Word delimiters (e.g., a space or punctuation mark) are used as word endings. Position-dependent code tables, one for each column, are computed using the Huffman algorithm. The text is scanned from left to right and each character is replaced by its code from the corresponding table; a delimiter serves to initialize (reset) the correspondence. Using several coding trees, one obtains a greater degree of compression without performing clustering computations.
  • Keywords
    codes; data compression; trees (mathematics); Huffman code; Huffman trees; character frequency tables; coding trees; delimiter; text compression; Clustering algorithms; Decoding; Encoding; Frequency; Greedy algorithms; Heuristic algorithms; Huffman coding; Information theory; Partitioning algorithms; Performance analysis
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Data Compression Conference, 1991. DCC '91.
  • Conference_Location
    Snowbird, UT
  • Print_ISBN
    0-8186-9202-2
  • Type

    conf

  • DOI
    10.1109/DCC.1991.213309
  • Filename
    213309
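
  • Illustration
    The position-dependent scheme described in the abstract can be sketched roughly as follows. This is a minimal sketch written from the abstract alone, not the author's implementation: the delimiter set, the function names (huffman_code, positions, build_tables, encode, decode), and the choice to code each delimiter with the table of the column in which it occurs are all assumptions made for illustration. The sketch also ignores the cost of storing or transmitting the per-column code tables.

```python
import heapq
from collections import Counter
from itertools import count

# Assumed delimiter set; the abstract only says "a space or punctuation mark".
DELIMITERS = set(" \t\n.,;:!?")

def huffman_code(freqs):
    """Build a Huffman code {symbol: bitstring} from a frequency table."""
    if len(freqs) == 1:
        # A one-symbol alphabet still needs a non-empty codeword.
        return {next(iter(freqs)): "0"}
    tie = count()  # unique tie-breaker so the dicts are never compared
    heap = [(f, next(tie), {s: ""}) for s, f in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f0, _, c0 = heapq.heappop(heap)
        f1, _, c1 = heapq.heappop(heap)
        merged = {s: "0" + b for s, b in c0.items()}
        merged.update({s: "1" + b for s, b in c1.items()})
        heapq.heappush(heap, (f0 + f1, next(tie), merged))
    return heap[0][2]

def positions(text):
    """Yield (column, character); the column resets to 0 after a delimiter."""
    pos = 0
    for ch in text:
        yield pos, ch
        pos = 0 if ch in DELIMITERS else pos + 1

def build_tables(text):
    """One frequency table, and hence one Huffman code, per column."""
    freqs = {}
    for pos, ch in positions(text):
        freqs.setdefault(pos, Counter())[ch] += 1
    return {pos: huffman_code(f) for pos, f in freqs.items()}

def encode(text, tables):
    """Scan left to right, substituting the code from the column's table."""
    return "".join(tables[pos][ch] for pos, ch in positions(text))

def decode(bits, tables):
    """Invert encode(); prefix-free codes allow greedy matching per column."""
    inverse = {p: {b: s for s, b in t.items()} for p, t in tables.items()}
    out, pos, buf = [], 0, ""
    for bit in bits:
        buf += bit
        ch = inverse[pos].get(buf)
        if ch is not None:
            out.append(ch)
            buf = ""
            pos = 0 if ch in DELIMITERS else pos + 1
    return "".join(out)

if __name__ == "__main__":
    sample = "the cat sat on the mat, then the rat ran."
    tables = build_tables(sample)
    bits = encode(sample, tables)
    assert decode(bits, tables) == sample
    print(len(bits), "bits for", len(sample), "characters")
```

    Because the decoder tracks the same column counter as the encoder and resets it on every decoded delimiter, it always knows which table to consult without any extra side information in the bit stream.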