• DocumentCode
    2662752
  • Title
    Exploiting Heterogeneity for Collective Data Downloading in Volunteer-based Networks
  • Author
    Kim, Jinoh ; Chandra, Abhishek ; Weissman, Jon
  • Author_Institution
    Dept. of Comput. Sci. & Eng., Minnesota Univ., Minneapolis, MN
  • fYear
    2007
  • fDate
    14-17 May 2007
  • Firstpage
    275
  • Lastpage
    282
  • Abstract
    Scientific computing is being increasingly deployed over volunteer-based distributed computing environments consisting of idle resources on donated user machines. A fundamental challenge in these environments is the dissemination of data to the computation nodes, with the successful completion of jobs being driven by the efficiency of collective data download across compute nodes, and not only the individual download times. This paper considers the use of a data network consisting of data distributed across a set of data servers, and focuses on the server selection problem: how should individual nodes select a server for downloading data so as to minimize the communication makespan, that is, the maximal download time for a data file? Through experiments conducted on a Pastry network running on PlanetLab, we demonstrate that nodes in a volunteer-based network are heterogeneous in terms of several metrics, such as bandwidth, load, and capacity, which impact their download behavior. We propose new server selection heuristics that incorporate these metrics, and demonstrate that these heuristics outperform traditional proximity-based server selection, reducing average makespans by at least 30%. We further show that incorporating information about download concurrency avoids overloading servers, and improves performance by about 17-43% over heuristics considering only proximity and bandwidth.
  • Keywords
    grid computing; collective data downloading; distributed computing environments; heterogeneity; idle resources; Pastry network; scientific computing; volunteer-based networks; Bandwidth; Biomedical computing; Computer networks; Concurrent computing; Data engineering; Distributed computing; File servers; Grid computing; Network servers; Peer to peer computing
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Cluster Computing and the Grid, 2007. CCGRID 2007. Seventh IEEE International Symposium on
  • Conference_Location
    Rio de Janeiro
  • Print_ISBN
    0-7695-2833-3
  • Type
    conf
  • DOI
    10.1109/CCGRID.2007.50
  • Filename
    4215391
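
  • Illustration
    The makespan objective named in the abstract can be illustrated with a small sketch. This is not the paper's heuristic: the Server class, the equal-sharing bandwidth model, and the greedy assignment below are all assumptions made for illustration. The sketch compares a bandwidth-only choice with a choice that also accounts for how many downloads a server is already handling, and reports the makespan (the longest download time across all nodes).

```python
from dataclasses import dataclass


@dataclass
class Server:
    """Hypothetical data server; fields are illustrative, not from the paper."""
    name: str
    bandwidth_mbps: float  # assumed measured bandwidth
    active: int = 0        # downloads currently assigned to this server


def estimated_time(server: Server, file_mb: float) -> float:
    """Estimated seconds for one more download.

    Assumption: bandwidth is split equally among concurrent downloads.
    """
    share = server.bandwidth_mbps / (server.active + 1)
    return file_mb * 8 / share


def makespan(num_nodes: int, servers: list, file_mb: float,
             use_concurrency: bool) -> float:
    """Greedily assign nodes to servers and return the maximal download time."""
    for s in servers:
        s.active = 0
    choices = []
    for _ in range(num_nodes):
        if use_concurrency:
            # Pick the server with the lowest estimated time given its load.
            best = min(servers, key=lambda s: estimated_time(s, file_mb))
        else:
            # Bandwidth-only: always pick the fastest server, ignoring load.
            best = max(servers, key=lambda s: s.bandwidth_mbps)
        best.active += 1
        choices.append(best)
    # Final time per node once all downloads share each server equally.
    times = [file_mb * 8 / (s.bandwidth_mbps / s.active) for s in choices]
    return max(times)
```

    With two hypothetical servers of 100 and 50 Mbps, three nodes, and a 100 MB file, the bandwidth-only choice sends every node to the faster server and yields a makespan of about 24 s, while the concurrency-aware choice spreads the load and yields 16 s under this simplified model.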