Title :
Searching very large bodies of data using a transparent peer-to-peer proxy
Author :
Taylor, Mike ; Cromme, Marc
Author_Institution :
Index Data, London, UK
Abstract :
While individual data stores are increasingly large, the aggregate size of the Internet dwarfs them all and always will. We consider an approach to searching rich documents across a very large network of individual data stores using a transparent peer-to-peer proxy. This approach is dependent on the use of a standardised search-and-retrieve protocol sufficiently rich to enable semantics to be induced on both its documents and its queries. Candidate protocols include the mature Z39.50 and the more recent SRW/U, of which the latter is considered more "Web-friendly". Networks of the peers underlying this approach to large-repository search and retrieval may take on widely differing topologies, and queries may be routed in widely different ways. Optimal values of tuning parameters may be determined using an evolutionary system in which simulations of different configurations compete against each other. The European collaborative project Alvis is using the approach outlined in this paper to build a semantic peer-to-peer search engine aggregated across multiple subject-specific repositories. Among the problems still to be solved, the matter of how to merge results from multiple peers is the most difficult.
Keywords :
Internet; information retrieval; peer-to-peer computing; protocols; search engines; candidate protocol; data stores; document searching; evolutionary system; retrieve protocol; search engine; search protocol; transparent peer-to-peer proxy; ANSI standards; Access protocols; Aggregates; Bit error rate; ISO standards; Libraries; Network topology; Peer to peer computing; Standards development
Conference_Title :
Proceedings of the Sixteenth International Workshop on Database and Expert Systems Applications, 2005
Print_ISBN :
0-7695-2424-9
DOI :
10.1109/DEXA.2005.170