Title :
Exploiting the Properties of Query Workload and File Name Distributions to Improve P2P Synopsis-Based Searches
Author :
Acosta, W. ; Chandra, Swarup
Author_Institution :
Univ. of Notre Dame, Notre Dame
Abstract :
Modern P2P systems use hybrid searches to improve search efficiency. They use a synopsis of neighborhood content to determine whether to use a structured or unstructured overlay to satisfy a particular query. Because of its size restrictions, a synopsis cannot hold all the terms from every file in the neighborhood. The challenge is to choose the terms that should be represented in the synopsis. In this work, we investigated the distribution of query terms and file terms in Gnutella networks. We observed a mismatch between the terms that were popular among file names and the terms that were popular among user-generated queries. Because query behavior changed with time, a synopsis based only on a static set of popular file terms was ill-suited to support efficient searches. We used these observations to design a synopsis creation algorithm that dynamically adapted to the query workload and selected synopsis terms that reflected popular terms in both the query workload and the file distribution. Our preliminary experimental analysis showed that our Query-Adaptive synopsis improved search performance over the traditional file-based synopsis model.
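The term-selection idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm, which the abstract does not specify. The function names, the mixing weight `alpha`, and the linear blend of normalized frequencies are all assumptions made for illustration. The sketch ranks terms that occur in neighborhood file names by a blend of file popularity and query popularity, then keeps as many as the synopsis capacity allows.

```python
from collections import Counter


def normalize(counts):
    """Scale raw counts to relative frequencies (empty input gives {})."""
    total = sum(counts.values())
    return {t: c / total for t, c in counts.items()} if total else {}


def build_synopsis(file_terms, query_terms, capacity, alpha=0.5):
    """Pick up to `capacity` file terms ranked by blended popularity.

    alpha is a hypothetical mixing weight (not from the paper):
    0 gives a purely file-based synopsis, 1 ranks only by query popularity.
    """
    file_freq = normalize(Counter(file_terms))
    query_freq = normalize(Counter(query_terms))
    # Only terms that actually occur in neighborhood files are candidates.
    scores = {
        t: (1 - alpha) * f + alpha * query_freq.get(t, 0.0)
        for t, f in file_freq.items()
    }
    # Sort by descending score, breaking ties alphabetically for determinism.
    ranked = sorted(scores, key=lambda t: (-scores[t], t))
    return set(ranked[:capacity])


# Illustrative, made-up term lists (not data from the paper).
file_terms = ["mp3"] * 5 + ["live"] * 3 + ["remix"] + ["demo"]
query_terms = ["remix"] * 6 + ["demo"] * 3 + ["video"]

file_based = build_synopsis(file_terms, query_terms, capacity=2, alpha=0.0)
adaptive = build_synopsis(file_terms, query_terms, capacity=2, alpha=0.5)
```

Passing only a recent window of observed queries as `query_terms` and rebuilding periodically is one simple way such a synopsis could follow a changing workload. That windowing policy is also an assumption rather than something stated in the abstract.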
Keywords :
peer-to-peer computing; query processing; Gnutella network; P2P synopsis-based search; file name distribution; file-based synopsis model; query workload; query-adaptive synopsis; synopsis creation algorithm; Algorithm design and analysis; Communications Society; Computer science; Filters; Floods; Heuristic algorithms; Indexing; Monitoring; Performance analysis; Routing
Conference_Title :
INFOCOM 2008. The 27th Conference on Computer Communications. IEEE
Conference_Location :
Phoenix, AZ
Print_ISBN :
978-1-4244-2025-4
DOI :
10.1109/INFOCOM.2008.317