• DocumentCode
    611049
  • Title

    Resilin: Elastic MapReduce over Multiple Clouds

  • Author

    Iordache, A. ; Morin, Christine ; Parlavantzas, N. ; Feller, E. ; Riteau, Pierre

  • Author_Institution
    Inria Centre Rennes - Bretagne Atlantique, Rennes, France
  • fYear
    2013
  • fDate
    13-16 May 2013
  • Firstpage
    261
  • Lastpage
    268
  • Abstract
    The MapReduce programming model offers a simple and efficient way of performing distributed computation over large data sets. To enable the usage of MapReduce in the cloud, Amazon Web Services offers Elastic MapReduce (EMR), a web service enabling users to easily run MapReduce jobs by leveraging Amazon resources (i.e. compute and storage). EMR takes care of tasks such as resource provisioning, performance tuning, and fault tolerance thus allowing the users to concentrate on the problem to be solved. However, EMR is restricted to Amazon's resources and is provided at an additional cost. In this paper, we present the design, implementation, and evaluation of Resilin, a novel EMR API-compatible system to perform distributed MapReduce computations. Resilin goes one step beyond Amazon's proprietary EMR solution and allows users (e.g. companies, scientists) to leverage resources from one or multiple public and/or private clouds. This gives Resilin users the opportunity to perform MapReduce computations over a large number of potentially geographically distributed resources. An extensive experimental evaluation conducted on multiple clusters of the Grid'5000 experimentation test bed shows that Resilin enables the use of geographically distributed resources with only limited impact on MapReduce jobs execution time.
  • Keywords
    Web services; cloud computing; distributed programming; resource allocation; software fault tolerance; software performance evaluation; Amazon Web services; Amazon resources; EMR API-compatible system; Grid'5000 experimentation testbed; MapReduce job execution time; MapReduce programming model; Resilin; cloud computing; distributed MapReduce computations; elastic MapReduce; fault tolerance; geographically distributed resources; multiple clouds; performance tuning; private clouds; public clouds; Benchmark testing; Cloud computing; Clouds; Computational modeling; Distributed databases; Fault tolerance; Programming; Apache Hadoop; Cloud Computing; Elastic MapReduce; Multi-cloud Environment; Virtualization;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Cluster, Cloud and Grid Computing (CCGrid), 2013 13th IEEE/ACM International Symposium on
  • Conference_Location
    Delft
  • Print_ISBN
    978-1-4673-6465-2
  • Type

    conf

  • DOI
    10.1109/CCGrid.2013.48
  • Filename
    6546101