• DocumentCode
    2666898
  • Title
    Randomized protocols for duplicate elimination in peer-to-peer storage systems
  • Author
    Ferreira, Ronaldo A. ; Ramanathan, Murali Krishna ; Grama, Ananth ; Jagannathan, Suresh
  • Author_Institution
    Dept. of Comput. Sci., Purdue Univ., West Lafayette, IN, USA
  • fYear
    2005
  • fDate
    31 Aug.-2 Sept. 2005
  • Firstpage
    201
  • Lastpage
    208
  • Abstract
    Distributed peer-to-peer storage systems rely on the voluntary participation of peers to manage a storage pool effectively. Files are generally replicated at several sites to provide acceptable levels of availability. If disk space on these peers is not carefully monitored and provisioned, the system may not be able to provide availability for certain files. In particular, the identification and elimination of redundant data are important problems that may arise in long-lived systems. Scalability and availability are competing goals in these networks: scalability concerns dictate aggressive elimination of replicas, while availability considerations argue for retaining them. In this paper, the authors provide a novel and efficient solution that addresses both goals with respect to the management of redundant data. Specifically, they address the problem of duplicate elimination in systems connected over an unstructured peer-to-peer network, in which there is no a priori binding between an object and its location. They propose a new randomized protocol that solves this problem in a scalable and decentralized fashion without compromising the availability requirements of the application. Performance results from both large-scale simulations and a prototype built on PlanetLab demonstrate that the protocols provide high probabilistic guarantees of success while incurring minimal administrative overhead.
  • Keywords
    information retrieval systems; peer-to-peer computing; protocols; replicated databases; distributed peer-to-peer storage system; duplicate elimination; randomized protocol; redundant data management; unstructured peer-to-peer network; Availability; Distributed computing; Large-scale systems; Monitoring; Peer to peer computing; Protocols; Scalability; Virtual prototyping;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Peer-to-Peer Computing, 2005. P2P 2005. Fifth IEEE International Conference on
  • Print_ISBN
    0-7695-2376-5
  • Type
    conf
  • DOI
    10.1109/P2P.2005.30
  • Filename
    1551042