DocumentCode
2284422
Title
Fast similarity search in peer-to-peer networks
Author
Bocek, Thomas ; Hunt, Ela ; Hausheer, David ; Stiller, Burkhard
Author_Institution
Dept. of Inf. IFI, Univ. of Zurich, Zurich
fYear
2008
fDate
7-11 April 2008
Firstpage
240
Lastpage
247
Abstract
Peer-to-peer (P2P) systems show numerous advantages over centralized systems, such as load balancing, scalability, and fault tolerance, and they require certain functionality, such as search, repair, and message and data transfer. In particular, structured P2P networks perform an exact search in logarithmic time proportional to the number of peers. However, keyword similarity search in a structured P2P network remains a challenge. Similarity search for service discovery can significantly improve service management in a distributed environment. As services are often described informally in text form, keyword similarity search can find the required services or data items more reliably. This paper presents a fast similarity search algorithm for structured P2P systems. The new algorithm, called P2P fast similarity search (P2PFastSS), finds similar keys in any distributed hash table (DHT) using the edit distance metric, and is independent of the underlying P2P routing algorithm. Performance analysis shows that P2PFastSS carries out a similarity search in time proportional to the logarithm of the number of peers. Simulations on PlanetLab confirm these results and show that a similarity search with 34,000 peers performs in less than three seconds on average. Thus, P2PFastSS is suitable for similarity search in large-scale network infrastructures, such as service description matching in service discovery or searching for similar terms in P2P storage networks.
Keywords
cryptography; fault tolerance; file organisation; peer-to-peer computing; resource allocation; telecommunication network routing; P2P routing algorithm; P2PFastSS; PlanetLab; centralized systems; data transfer; distributed hash table; edit distance metric; fast similarity search; fault tolerance; keyword similarity search; load balancing; peer-to-peer networks; service discovery; Computer science; Data engineering; Electronic mail; Fault tolerant systems; Informatics; Laboratories; Load management; Peer to peer computing; Routing; Scalability; DHT; P2P; service discovery; similarity search;
fLanguage
English
Publisher
IEEE
Conference_Titel
Network Operations and Management Symposium, 2008. NOMS 2008. IEEE
Conference_Location
Salvador, Bahia
ISSN
1542-1201
Print_ISBN
978-1-4244-2065-0
Electronic_ISBN
1542-1201
Type
conf
DOI
10.1109/NOMS.2008.4575140
Filename
4575140