• DocumentCode
    2831396
  • Title
    Efficient maintenance scheme of inverted index for large-scale full-text retrieval
  • Author
    Liu, Xiaozhu
  • Author_Institution
    State Key Lab. of Software Eng., Wuhan Univ., Wuhan, China
  • Volume
    1
  • fYear
    2010
  • fDate
    21-24 May 2010
  • Abstract
    The inverted index is the mainstay of modern full-text retrieval systems, and an appropriate maintenance scheme for inverted files is a promising way to improve the time and space efficiency of managing and retrieving huge amounts of information. To improve the retrieval performance of inverted indexes in large-scale full-text systems, this paper proposes a time- and space-efficient random access blocked inverted index (RABI) and an efficient dynamic maintenance scheme (DMS). RABI divides each inverted list into blocks and compresses the different parts of each block with a corresponding compression method to reduce space consumption. Based on RABI, DMS distinguishes between long and short posting lists: short posting lists are updated with a remerge strategy, and long posting lists are updated with a hybrid in-place and remerge strategy. Experimental results show that, compared with existing schemes, the proposed scheme substantially reduces, on average, the space cost, the conjunctive Boolean query time, and the cost of on-line index construction.
  • Keywords
    data compression; information retrieval systems; compression method; dynamic maintenance scheme; in-place strategy; inverted index maintenance scheme; large-scale full-text retrieval systems; random access blocked inverted index; remerge strategy; short posting lists; Appropriate technology; Automation; Costs; Information management; Information retrieval; Large-scale systems; Query processing; Software engineering; Space technology; Technology management; index maintenance; information retrieval; inverted index;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Future Computer and Communication (ICFCC), 2010 2nd International Conference on
  • Conference_Location
    Wuhan
  • Print_ISBN
    978-1-4244-5821-9
  • Type
    conf
  • DOI
    10.1109/ICFCC.2010.5497725
  • Filename
    5497725
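
The abstract describes a blocked, compressed inverted list with random access, plus an update policy that depends on posting-list length. The following is a minimal Python sketch of that general idea, not the paper's implementation. Everything specific in it is an assumption for illustration: the record does not say which compression methods RABI uses, so delta gaps with variable-byte coding stand in for them; the block size, the `threshold` value, and all names (`BlockedPostingList`, `intersect`, `choose_update_strategy`) are hypothetical.

```python
from bisect import bisect_right


def vbyte_encode(numbers):
    """Variable-byte encode non-negative integers (7 data bits per byte)."""
    out = bytearray()
    for n in numbers:
        while n >= 128:
            out.append(n & 0x7F)
            n >>= 7
        out.append(n | 0x80)  # high bit marks the last byte of a number
    return bytes(out)


def vbyte_decode(data):
    """Inverse of vbyte_encode."""
    numbers, n, shift = [], 0, 0
    for byte in data:
        if byte & 0x80:
            numbers.append(n | ((byte & 0x7F) << shift))
            n, shift = 0, 0
        else:
            n |= byte << shift
            shift += 7
    return numbers


class BlockedPostingList:
    """Posting list split into blocks (assumed layout, not the paper's).

    The first docID of each block is kept uncompressed so a lookup can
    jump straight to one block; the rest of the block is stored as
    variable-byte coded gaps.
    """

    def __init__(self, doc_ids, block_size=4):
        doc_ids = sorted(set(doc_ids))
        self.length = len(doc_ids)
        self.first_ids = []  # uncompressed block headers
        self.blocks = []     # compressed gap payloads
        for start in range(0, len(doc_ids), block_size):
            block = doc_ids[start:start + block_size]
            gaps = [b - a for a, b in zip(block, block[1:])]
            self.first_ids.append(block[0])
            self.blocks.append(vbyte_encode(gaps))

    def decode_block(self, i):
        ids = [self.first_ids[i]]
        for gap in vbyte_decode(self.blocks[i]):
            ids.append(ids[-1] + gap)
        return ids

    def contains(self, doc_id):
        # Binary search over block headers, then decode a single block.
        i = bisect_right(self.first_ids, doc_id) - 1
        return i >= 0 and doc_id in self.decode_block(i)


def intersect(short_list, long_list):
    """Conjunctive Boolean query: scan the short list, probe the long one."""
    result = []
    for i in range(len(short_list.blocks)):
        result.extend(d for d in short_list.decode_block(i)
                      if long_list.contains(d))
    return result


def choose_update_strategy(posting_list, threshold=1000):
    """Pick an update strategy by list length; threshold is hypothetical."""
    if posting_list.length >= threshold:
        return "hybrid in-place and remerge"
    return "remerge"
```

For example, with `a = BlockedPostingList([3, 7, 300, 1000, 1001, 5000])` and `b = BlockedPostingList([7, 1001, 4000])`, calling `intersect(b, a)` decodes only the blocks of `a` that could contain each probed docID, which is the benefit that block-level random access is meant to provide.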