DocumentCode :
1478267
Title :
TSS: Efficient Term Set Search in Large Peer-to-Peer Textual Collections
Author :
Chen, Hanhua ; Yan, Jun ; Jin, Hai ; Liu, Yunhao ; Ni, Lionel M.
Author_Institution :
Sch. of Comput. Sci. & Technol., Huazhong Univ. of Sci. & Technol. (HUST), Wuhan, China
Volume :
59
Issue :
7
fYear :
2010
fDate :
7/1/2010
Firstpage :
969
Lastpage :
980
Abstract :
Previous multikeyword search schemes in DHT-based P2P systems often rely on multiple single-keyword search operations and therefore suffer from unacceptable traffic cost and poor accuracy. Precomputing a term-set-based index can significantly reduce the cost, but the index size grows exponentially. Based on our observations that 1) queries are typically short and 2) users usually have limited interests, we propose a novel index pruning method called TSS. By publishing only the most relevant term sets from the documents on the peers, TSS provides search performance comparable to that of a centralized solution, while the index size is reduced from exponential to O(n log n). We evaluate this design through comprehensive trace-driven simulations using the TREC WT10G data collection and the query log of a major commercial search engine.
Keywords :
computational complexity; peer-to-peer computing; query processing; DHT-based P2P systems; TREC WT10G data collection; TSS method; distributed hash table; index pruning method; multikeyword search; peer-to-peer textual collections; search engine; search performance; term set search; Availability; Costs; IP networks; Indexing; Internet; Keyword search; Large-scale systems; Peer to peer computing; Publishing; Scalability; Search engines; Peer-to-peer; multikeyword searching; ranking
fLanguage :
English
Journal_Title :
Computers, IEEE Transactions on
Publisher :
IEEE
ISSN :
0018-9340
Type :
jour
DOI :
10.1109/TC.2010.81
Filename :
5453340