• DocumentCode
    610344
  • Title
    LSII: An indexing structure for exact real-time search on microblogs
  • Author
    Lingkun Wu ; Wenqing Lin ; Xiaokui Xiao ; Yabo Xu
  • Author_Institution
    Nanyang Technological University, Singapore, Singapore
  • fYear
    2013
  • fDate
    8-12 April 2013
  • Firstpage
    482
  • Lastpage
    493
  • Abstract
    Indexing microblogs for real-time search is challenging because of the efficiency issue caused by the tremendous speed at which users create new microblogs. Existing approaches address this efficiency issue at the cost of query accuracy, as they either (i) exclude a significant portion of microblogs from the index to reduce update cost or (ii) rank microblogs mostly by their timestamps (without sufficient consideration of their relevance to the queries) to enable append-only index insertion. As a consequence, the search results returned by the existing approaches do not satisfy users who demand timely and high-quality search results. To remedy this deficiency, we propose the Log-Structured Inverted Indices (LSII), a structure for exact real-time search on microblogs. The core of LSII is a sequence of inverted indices with exponentially increasing sizes, such that new microblogs are (i) first inserted into the smallest index and (ii) later moved into the larger indices in a batch manner. The batch insertion mechanism leads to a small amortized update cost for each new microblog, without significantly degrading query performance. We present a comprehensive study on LSII, exploring various design options to strike a good balance between query and update performance. In addition, we propose extensions of LSII to support personalized search and to exploit multi-threading for performance improvement. Extensive experiments on real data demonstrate the efficiency of LSII.
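    The batch-insertion idea in the abstract (a sequence of inverted indices with exponentially increasing sizes, where new items enter the smallest index and are later moved into larger ones in batches) can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `LogStructuredIndex`, the doubling capacity rule, the `base_capacity` parameter, and the caller-supplied `score` (a stand-in for whatever ranking function combines relevance and freshness) are all assumptions made for illustration.

```python
from collections import defaultdict


class LogStructuredIndex:
    """Toy sequence of inverted indices; level i is merged away at base * 2**i docs."""

    def __init__(self, base_capacity=2):
        self.base_capacity = base_capacity  # hypothetical size limit of level 0
        self.levels = [self._empty()]       # levels[0] is the smallest index

    @staticmethod
    def _empty():
        # One inverted index: term -> list of (doc_id, score) postings.
        return {"postings": defaultdict(list), "size": 0}

    def _capacity(self, i):
        return self.base_capacity * (2 ** i)

    def insert(self, doc_id, score, text):
        """Add a new document to the smallest index, then merge if it is full."""
        level0 = self.levels[0]
        for term in set(text.lower().split()):
            level0["postings"][term].append((doc_id, score))
        level0["size"] += 1
        self._merge_down()

    def _merge_down(self):
        """Move full levels into the next larger level, one batch at a time."""
        i = 0
        while self.levels[i]["size"] >= self._capacity(i):
            if i + 1 == len(self.levels):
                self.levels.append(self._empty())
            src, dst = self.levels[i], self.levels[i + 1]
            for term, plist in src["postings"].items():
                dst["postings"][term].extend(plist)
            dst["size"] += src["size"]
            self.levels[i] = self._empty()
            i += 1

    def search(self, term, k=3):
        """Return the ids of the k highest-scoring documents containing `term`.

        Every level is consulted, so no indexed document is skipped.
        """
        hits = []
        for level in self.levels:
            hits.extend(level["postings"].get(term.lower(), []))
        hits.sort(key=lambda hit: hit[1], reverse=True)
        return [doc_id for doc_id, _ in hits[:k]]
```

    In this sketch most insertions touch only the smallest index, and a document is copied once per level it passes through, which is the intuition behind a small amortized update cost. A real design would also have to keep queries fast across levels, which the sketch does not attempt.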
  • Keywords
    Web sites; indexing; LSII; append-only index insertion; batch insertion mechanism; exact real-time search; high-quality search results; indexing microblogs; indexing structure; log-structured inverted indices; microblog ranking; query performance; Corporate acquisitions; Indexing; Query processing; Real-time systems; Servers; Vectors;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Data Engineering (ICDE), 2013 IEEE 29th International Conference on
  • Conference_Location
    Brisbane, QLD
  • ISSN
    1063-6382
  • Print_ISBN
    978-1-4673-4909-3
  • Electronic_ISBN
    1063-6382
  • Type
    conf
  • DOI
    10.1109/ICDE.2013.6544849
  • Filename
    6544849