• DocumentCode
    2734000
  • Title
    Compressing Inverted File Index Using Mixed Delta/Flat Binary Code
  • Author
    Chen, Jinlin ; Zhong, Ping ; Cook, Terry
  • Author_Institution
    Dept. of Comput. Sci., CUNY, Flushing, NY
  • fYear
    2006
  • fDate
    6-6 Dec. 2006
  • Firstpage
    338
  • Lastpage
    343
  • Abstract
    By clustering the d-gaps of an inverted list strictly according to a threshold, and then encoding clustered and non-clustered d-gaps with different methods, we can tailor the encoding to the specific properties of each kind of d-gap and achieve a better compression ratio. Based on this idea, we propose a cluster-based mixed approach to inverted file index compression: mixed delta/flat binary code. Experimental results show that the new coding scheme achieves a better compression ratio than interpolative code, which is considered one of the most efficient bitwise codes at present. In addition, the new code has much lower complexity than interpolative code and therefore enables faster encoding and decoding.
  • Keywords
    binary codes; file organisation; indexing; pattern clustering; bitwise codes; coding scheme; d-gaps clustering; interpolative code; inverted file index compression ratio; inverted list; mixed delta/flat binary code; Binary codes; Computer science; Decoding; Educational institutions; Encoding; Indexing; Probability distribution;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Digital Information Management, 2006 1st International Conference on
  • Conference_Location
    Bangalore
  • Print_ISBN
    1-4244-0682-X
  • Type
    conf
  • DOI
    10.1109/ICDIM.2007.369220
  • Filename
    4221912
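
The abstract describes splitting the d-gaps of an inverted list by a threshold and coding the two groups differently. The following is a minimal Python sketch of that general idea only, not the authors' algorithm: the abstract does not say which group receives which code, how the threshold is chosen, or how the decoder learns which code applies to each gap. The sketch assumes, purely for illustration, that gaps at or below a hypothetical threshold get a fixed-width (flat) binary code and that larger gaps get an Elias delta code. All function names are hypothetical.

```python
def d_gaps(postings):
    """Convert a strictly increasing list of document IDs into d-gaps."""
    gaps, prev = [], 0
    for doc_id in postings:
        gaps.append(doc_id - prev)
        prev = doc_id
    return gaps


def elias_gamma(n):
    """Elias gamma code of an integer n >= 1, returned as a bit string."""
    binary = bin(n)[2:]
    return "0" * (len(binary) - 1) + binary


def elias_delta(n):
    """Elias delta code of an integer n >= 1, returned as a bit string."""
    binary = bin(n)[2:]
    # Gamma-code the bit length, then append the bits after the leading 1.
    return elias_gamma(len(binary)) + binary[1:]


def flat_binary(n, width):
    """Fixed-width binary code of n (1 <= n <= 2**width), stored as n - 1."""
    return format(n - 1, "0{}b".format(width))


def encode_mixed(gaps, threshold):
    """Assumed scheme: flat code for gaps <= threshold, delta code otherwise.

    Returns (kind, bits) pairs. How a real decoder would tell the two kinds
    apart is not covered by the abstract and is left out here.
    """
    width = max(1, (threshold - 1).bit_length())
    coded = []
    for gap in gaps:
        if gap <= threshold:
            coded.append(("flat", flat_binary(gap, width)))
        else:
            coded.append(("delta", elias_delta(gap)))
    return coded


postings = [3, 5, 6, 8, 40, 41, 43]   # made-up document IDs
gaps = d_gaps(postings)
coded = encode_mixed(gaps, threshold=4)
total_bits = sum(len(bits) for _, bits in coded)
```

With these made-up postings, the six small gaps each take a 2-bit flat code and the single large gap takes a delta code. This shows why a list dominated by small gaps can benefit from a cheap fixed-width code.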