DocumentCode
610344
Title
LSII: An indexing structure for exact real-time search on microblogs
Author
Lingkun Wu ; Wenqing Lin ; Xiaokui Xiao ; Yabo Xu
Author_Institution
Nanyang Technol. Univ., Singapore, Singapore
fYear
2013
fDate
8-12 April 2013
Firstpage
482
Lastpage
493
Abstract
Indexing microblogs for real-time search is challenging because of the efficiency issue caused by the tremendous speed at which users create new microblogs. Existing approaches address this efficiency issue at the cost of query accuracy, as they either (i) exclude a significant portion of microblogs from the index to reduce update cost or (ii) rank microblogs mostly by their timestamps (without sufficient consideration of their relevance to the queries) to enable append-only index insertion. As a consequence, the search results returned by the existing approaches do not satisfy users who demand timely and high-quality search results. To remedy this deficiency, we propose the Log-Structured Inverted Indices (LSII), a structure for exact real-time search on microblogs. The core of LSII is a sequence of inverted indices with exponentially increasing sizes, such that new microblogs are (i) first inserted into the smallest index and (ii) later moved into the larger indices in a batch manner. The batch insertion mechanism leads to a small amortized update cost for each new microblog, without significantly degrading query performance. We present a comprehensive study on LSII, exploring various design options to strike a good balance between query and update performance. In addition, we propose extensions of LSII to support personalized search and to exploit multi-threading for performance improvement. Extensive experiments on real data demonstrate the efficiency of LSII.
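The batch insertion idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the class name `LogStructuredIndex`, the `base_capacity` parameter, the doubling rule, and the merge trigger are all assumptions made for illustration, and the ranking by relevance and freshness that the paper addresses is omitted (the sketch only returns matching identifiers).

```python
from collections import defaultdict


class LogStructuredIndex:
    """Toy sequence of inverted indices whose capacities double per level.

    Hypothetical sketch: names and the merge policy are illustrative only.
    """

    def __init__(self, base_capacity=2):
        self.base_capacity = base_capacity  # assumed size of the smallest index
        self.levels = [self._new_level()]   # levels[0] is the smallest index

    @staticmethod
    def _new_level():
        return {"postings": defaultdict(list), "count": 0}

    def _capacity(self, i):
        # Level i holds up to base_capacity * 2**i documents before it is flushed.
        return self.base_capacity * (2 ** i)

    def insert(self, doc_id, terms):
        # New documents always enter the smallest index first.
        level = self.levels[0]
        for term in set(terms):
            level["postings"][term].append(doc_id)
        level["count"] += 1
        self._cascade()

    def _cascade(self):
        # When a level is full, move its whole content into the next level
        # in one batch, then check whether that level is now full as well.
        i = 0
        while i < len(self.levels) and self.levels[i]["count"] >= self._capacity(i):
            if i + 1 == len(self.levels):
                self.levels.append(self._new_level())
            src, dst = self.levels[i], self.levels[i + 1]
            for term, ids in src["postings"].items():
                dst["postings"][term].extend(ids)
            dst["count"] += src["count"]
            self.levels[i] = self._new_level()
            i += 1

    def search(self, term):
        # A query must consult every level, since a document may sit in any of them.
        hits = []
        for level in self.levels:
            hits.extend(level["postings"].get(term, []))
        return sorted(hits)
```

In this sketch most insertions touch only the smallest index, and the cost of the occasional merge is spread over the many insertions that preceded it, which is the intuition behind the small amortized update cost. The price is that a query has to visit every level.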
Keywords
Web sites; indexing; LSII; append-only index insertion; batch insertion mechanism; exact real-time search; high-quality search results; indexing microblogs; indexing structure; log-structured inverted indices; microblog ranking; query performance; Corporate acquisitions; Indexing; Query processing; Real-time systems; Servers; Vectors;
fLanguage
English
Publisher
IEEE
Conference_Titel
Data Engineering (ICDE), 2013 IEEE 29th International Conference on
Conference_Location
Brisbane, QLD
ISSN
1063-6382
Print_ISBN
978-1-4673-4909-3
Electronic_ISBN
1063-6382
Type
conf
DOI
10.1109/ICDE.2013.6544849
Filename
6544849