• DocumentCode
    1595119
  • Title

    Associative search in peer to peer networks: harnessing latent semantics

  • Author

    Cohen, Edith ; Fiat, A. ; Kaplan, Haim

  • Author_Institution
    Res. Labs., AT&T, Florham Park, NJ, USA
  • Volume
    2
  • fYear
    2003
  • Firstpage
    1261
  • Abstract
    The success of a P2P file-sharing network depends heavily on the scalability and versatility of its search mechanism. Two particularly desirable search features are scope (the ability to find infrequent items) and support for partial-match queries (queries that contain typos or include a subset of keywords). While centralized-index architectures (such as Napster) can support both these features, existing decentralized architectures seem to support at most one: prevailing unstructured P2P protocols (such as Gnutella and FastTrack) deploy a "blind" search mechanism where the set of peers probed is unrelated to the query; thus they support partial-match queries but have limited scope. At the other extreme, the recently proposed distributed hash tables (DHTs), such as CAN and Chord, couple index location with the item's hash value, and thus have good scope but cannot effectively support partial-match queries. Another hurdle to DHT deployment is their tight control of the overlay structure and of the information (part of the index) each peer maintains, which makes them more sensitive to failures and to frequent joins and disconnects. We develop a new class of decentralized P2P architectures. Our design is based on unstructured architectures such as Gnutella and FastTrack, and retains many of their appealing properties, including support for partial-match queries and relative resilience to peer failures. Yet we obtain orders-of-magnitude improvement in the efficiency of locating rare items. Our approach exploits associations inherent in human selections to steer the search process to peers that are more likely to have an answer to the query. We demonstrate the potential of associative search using models, analysis, and simulations.
  • Keywords
    Internet; file organisation; information retrieval; protocols; search problems; FastTrack; Gnutella; Napster; associative search; blind search mechanism; centralized-index architectures; decentralized peer to peer architectures; distributed hash tables; infrequent items; latent semantics; overlay structure; partial-match queries; peer to peer file-sharing network; peer to peer protocol; scalability; versatility; Analytical models; Computer architecture; Humans; IP networks; Intelligent networks; Peer to peer computing; Probes; Protocols; Resilience; Scalability;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    INFOCOM 2003. Twenty-Second Annual Joint Conference of the IEEE Computer and Communications Societies
  • Conference_Location
    San Francisco, CA
  • ISSN
    0743-166X
  • Print_ISBN
    0-7803-7752-4
  • Type

    conf

  • DOI
    10.1109/INFCOM.2003.1208962
  • Filename
    1208962