DocumentCode
3294314
Title
Similarity Searching in Peer-to-Peer Databases
Author
Bhattacharya, Indrajit ; Kashyap, Srinivas R. ; Parthasarathy, Srinivasan
Author_Institution
Dept. of Comput. Sci., Maryland Univ., College Park, MD
fYear
2005
fDate
10-10 June 2005
Firstpage
329
Lastpage
338
Abstract
We consider the problem of handling similarity queries in peer-to-peer databases. We propose an indexing and searching mechanism which, given a query object, returns the set of objects in the database that are semantically related to the query. We propose an indexing scheme which clusters data such that semantically related objects are partitioned into a small set of clusters, allowing for a simple and efficient similarity search strategy. Our indexing scheme also decouples object and node locations. Our adaptive replication and randomized lookup schemes exploit this feature and ensure that the number of copies of an object is proportional to its popularity and all replicas are equally likely to serve a given query, thus achieving perfect load balancing. The techniques developed in this work are oblivious to the underlying DHT topology and can be implemented on a variety of structured overlays such as CAN, CHORD, Pastry, and Tapestry. We also present DHT-independent analytical guarantees for the performance of our algorithms in terms of search accuracy, cost, and load-balance; the experimental results from our simulations confirm the insights derived from these analytical models.
Keywords
database indexing; peer-to-peer computing; query formulation; query processing; CAN; CHORD; DHT topology; Pastry; Tapestry; adaptive replication; data clustering; data indexing; load balancing; peer-to-peer databases; query object; randomized lookup; similarity query handling; similarity search strategy; similarity searching; Analytical models; Computer science; Databases; Educational institutions; Indexing; Information retrieval; Load management; Peer to peer computing; Performance analysis; Topology
fLanguage
English
Publisher
IEEE
Conference_Titel
Distributed Computing Systems, 2005. ICDCS 2005. Proceedings. 25th IEEE International Conference on
Conference_Location
Columbus, OH
ISSN
1063-6927
Print_ISBN
0-7695-2331-5
Type
conf
DOI
10.1109/ICDCS.2005.74
Filename
1437096