DocumentCode :
1682074
Title :
On the need for query-centric unstructured peer-to-peer overlays
Author :
Acosta, William ; Chandra, Surendar
Author_Institution :
Univ. of Notre Dame, Notre Dame, IN
fYear :
2008
Firstpage :
1
Lastpage :
8
Abstract :
Hybrid P2P systems rely on the assumption that enough objects exist nearby to make the unstructured search component efficient. This availability depends on the object annotations as well as on the terms in the queries. Earlier work assumed that the object annotations and query terms follow a Zipf-like long-tail distribution. We show that the queries in real systems exhibit more complex temporal behavior. To support our position, we first analyzed the names and annotations of objects that were stored in two popular P2P sharing systems: Gnutella and Apple iTunes. We showed that the names and annotations exhibited a Zipf-like long-tail distribution. The long tail meant that over 98% of the objects were insufficiently replicated (present on fewer than 0.1% of the peers). We also analyzed a query trace of the Gnutella network and identified the popularity distribution of the terms used in the queries. We showed that the set of popular query terms remained stable over time and exhibited a similarity of over 90%. We also showed that despite the Zipf popularity distributions of both query terms and file annotation terms, there was little similarity over time (<20%) between popular file annotation terms and popular query terms. Prior P2P search performance analysis did not take this mismatch between the query terms and object annotations into account and thus overestimated the system performance. There is a need to develop unstructured P2P systems that are aware of the temporal mismatch between the object and query popularity distributions.
Keywords :
peer-to-peer computing; query formulation; Apple iTunes; Gnutella network; Zipf long tail distribution; file annotation terms; hybrid P2P systems; object popularity distributions; performance analysis; query popularity distributions; query terms; Buildings; Data structures; Floods; Peer to peer computing; Performance analysis; Probability distribution; Routing; System performance; Transcoding;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Parallel and Distributed Processing, 2008. IPDPS 2008. IEEE International Symposium on
Conference_Location :
Miami, FL
ISSN :
1530-2075
Print_ISBN :
978-1-4244-1693-6
Electronic_ISBN :
1530-2075
Type :
conf
DOI :
10.1109/IPDPS.2008.4536197
Filename :
4536197