• DocumentCode
    2766870
  • Title

    Dynamic Load-Balanced Multicast for Data-Intensive Applications on Clouds

  • Author

    Chiba, Tatsuhiro ; den Burger, Mathijs ; Kielmann, Thilo ; Matsuoka, Satoshi

  • fYear
    2010
  • fDate
    17-20 May 2010
  • Firstpage
    5
  • Lastpage
    14
  • Abstract
    Data-intensive parallel applications on clouds need to deploy large data sets from the cloud's storage facility to all compute nodes as fast as possible. Many multicast algorithms have been proposed for clusters and grid environments. The most common approach is to construct one or more spanning trees based on the network topology and network monitoring data in order to maximize available bandwidth and avoid bottleneck links. However, delivering optimal performance becomes difficult once the available bandwidth changes dynamically. In this paper, we focus on Amazon EC2/S3 (the most commonly used cloud platform today) and propose two high performance multicast algorithms. These algorithms make it possible to efficiently transfer large amounts of data stored in Amazon S3 to multiple Amazon EC2 nodes. The three salient features of our algorithms are (1) to construct an overlay network on clouds without network topology information, (2) to optimize the total throughput dynamically, and (3) to increase the download throughput by letting nodes cooperate with each other. The two algorithms differ in the way nodes cooperate: the first 'non-steal' algorithm lets each node download an equal share of all data, while the second 'steal' algorithm uses work stealing to counter the effect of heterogeneous download bandwidth. As a result, all nodes can download files from S3 quickly, even when the network performance changes while the algorithm is running. We evaluate our algorithms on EC2/S3, and show that they are scalable and consistently achieve high throughput. Both algorithms perform much better than having each node downloading all data directly from S3.
  • Keywords
    Application software; Bandwidth; Cloud computing; Clustering algorithms; Concurrent computing; Distributed computing; Grid computing; Multicast algorithms; Network topology; Throughput; Amazon EC2/S3; Cloud Computing; Multicast Algorithm; Work Stealing
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Title
    Cluster, Cloud and Grid Computing (CCGrid), 2010 10th IEEE/ACM International Conference on
  • Conference_Location
    Melbourne, Australia
  • Print_ISBN
    978-1-4244-6987-1
  • Type

    conf

  • DOI
    10.1109/CCGRID.2010.63
  • Filename
    5493497
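
The difference between the 'non-steal' and 'steal' algorithms described in the abstract can be illustrated with a toy simulation. This is only a rough sketch and not the paper's implementation: the function names (`split_evenly`, `simulate`), the fixed per-piece download times, and the rule that an idle node steals from the node with the most pieces left are all assumptions made for illustration. It models only the download phase from storage and ignores the exchange of pieces between nodes.

```python
from collections import deque


def split_evenly(num_pieces, num_nodes):
    """Give each node a contiguous, near-equal range of piece indices."""
    base, extra = divmod(num_pieces, num_nodes)
    queues, start = [], 0
    for node in range(num_nodes):
        size = base + (1 if node < extra else 0)
        queues.append(deque(range(start, start + size)))
        start += size
    return queues


def simulate(num_pieces, seconds_per_piece, steal):
    """Return (finish_time, pieces_fetched_per_node) for the download phase.

    seconds_per_piece[i] is a hypothetical, constant time node i needs to
    fetch one piece; real bandwidth would vary over time.
    """
    num_nodes = len(seconds_per_piece)
    queues = split_evenly(num_pieces, num_nodes)
    clock = [0.0] * num_nodes      # time at which each node becomes idle
    fetched = [0] * num_nodes      # pieces each node downloaded itself
    while any(queues):
        if steal:
            # Every node may take work while any piece is unassigned.
            ready = range(num_nodes)
        else:
            # Without stealing, a node only works on its own share.
            ready = [n for n in range(num_nodes) if queues[n]]
        # The next node to start a download is the one that is idle first.
        node = min(ready, key=lambda n: clock[n])
        if queues[node]:
            queues[node].popleft()
        else:
            # Assumed policy: steal from the back of the longest queue.
            victim = max(range(num_nodes), key=lambda n: len(queues[n]))
            queues[victim].pop()
        clock[node] += seconds_per_piece[node]
        fetched[node] += 1
    return max(clock), fetched


if __name__ == "__main__":
    speeds = [1.0, 4.0]  # one fast node, one slow node (made-up values)
    print("non-steal:", simulate(10, speeds, steal=False))
    print("steal:    ", simulate(10, speeds, steal=True))
```

With these made-up speeds, the equal split leaves the download phase waiting on the slow node, while stealing lets the fast node take over part of the slow node's share. The numbers say nothing about the results reported in the paper.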