  • DocumentCode
    81080
  • Title
    BestPeer++: A Peer-to-Peer Based Large-Scale Data Processing Platform
  • Author
    Gang Chen ; Tianlei Hu ; Dawei Jiang ; Peng Lu ; Kian-Lee Tan ; Hoang Tam Vo ; Sai Wu
  • Author_Institution
    Coll. of Comput. Sci., Zhejiang Univ., Hangzhou, China
  • Volume
    26
  • Issue
    6
  • fYear
    2014
  • fDate
    Jun-14
  • Firstpage
    1316
  • Lastpage
    1331
  • Abstract
    A corporate network is often used for sharing information among participating companies and for facilitating collaboration in an industry sector where companies share a common interest. It can effectively help the companies reduce their operational costs and increase revenues. However, inter-company data sharing and processing pose unique challenges for such a data management system, including scalability, performance, throughput, and security. In this paper, we present BestPeer++, a system which delivers elastic data sharing services for corporate network applications in the cloud, based on BestPeer, a peer-to-peer (P2P) based data management platform. By integrating cloud computing, database, and P2P technologies into one system, BestPeer++ provides an economical, flexible, and scalable platform for corporate network applications and delivers data sharing services to participants based on the widely accepted pay-as-you-go business model. We evaluate BestPeer++ on the Amazon EC2 cloud platform. The benchmarking results show that BestPeer++ outperforms HadoopDB, a recently proposed large-scale data processing system, when both systems are employed to handle typical corporate network workloads. The benchmarking results also demonstrate that BestPeer++ achieves near-linear scalability in throughput with respect to the number of peer nodes.
  • Keywords
    business data processing; cloud computing; data handling; intranets; peer-to-peer computing; public domain software; Amazon EC2 cloud platform; BestPeer++; HadoopDB; P2P based data management platform; P2P technology; corporate network workloads; data management system; elastic data sharing services; industry sector; inter-company data sharing; linear scalability; operational cost reduction; pay-as-you-go business model; peer-to-peer based large-scale data processing platform; Database systems; Query processing; Servers; MapReduce; Peer-to-peer systems; index
  • fLanguage
    English
  • Journal_Title
    Knowledge and Data Engineering, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1041-4347
  • Type
    jour
  • DOI
    10.1109/TKDE.2012.236
  • Filename
    6365635