  • DocumentCode
    625675
  • Title
    Cura: A Cost-Optimized Model for MapReduce in a Cloud
  • Author
    Palanisamy, Balaji ; Singh, Ashutosh ; Langston, Bryan
  • Author_Institution
    Coll. of Comput., Georgia Tech, Atlanta, GA, USA
  • fYear
    2013
  • fDate
    20-24 May 2013
  • Firstpage
    1275
  • Lastpage
    1286
  • Abstract
    We propose a new MapReduce cloud service model, Cura, for data analytics in the cloud. We argue that performing MapReduce analytics in existing cloud service models, using either a generic compute cloud or a dedicated MapReduce cloud, is inadequate and inefficient for production workloads. Existing services require users to select a number of complex cluster and job parameters while simultaneously forcing the cloud provider to use those potentially sub-optimal configurations, resulting in poor resource utilization and higher cost. In contrast, Cura leverages MapReduce profiling to automatically create the best cluster configuration for the jobs so as to obtain a global resource optimization from the provider's perspective. Secondly, to better serve modern MapReduce workloads, which constitute a large proportion of interactive real-time jobs, Cura uses a unique instant VM allocation technique that reduces response times by up to 65%. Thirdly, our system introduces deadline-awareness which, by delaying execution of certain jobs, allows the cloud provider to optimize its global resource allocation and reduce costs further. Cura also benefits from a number of additional performance enhancements, including cost-aware resource provisioning, VM-aware scheduling and online virtual machine reconfiguration. Our experimental results using Facebook-like workload traces show that, along with response time improvements, our techniques lead to more than 80% reduction in the compute infrastructure cost of the cloud data center.
  • Keywords
    cloud computing; computer centres; resource allocation; software cost estimation; virtual machines; Cura; Facebook-like workload; MapReduce analytics; MapReduce cloud service model; MapReduce profiling; MapReduce workload; VM allocation technique; VM-aware scheduling; cloud data center; cloud provider; cluster configuration; complex cluster; cost reduction; cost-aware resource provisioning; cost-optimized model; data analytics; deadline-awareness; dedicated MapReduce cloud; generic compute cloud; global resource allocation; global resource optimization; interactive real-time job; job execution; job parameter; online virtual machine reconfiguration; production workload; resource utilization; suboptimal configuration; Adaptation models; Computational modeling; Optimization; Production; Resource management; Schedules; Time factors
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Parallel & Distributed Processing (IPDPS), 2013 IEEE 27th International Symposium on
  • Conference_Location
    Boston, MA
  • ISSN
    1530-2075
  • Print_ISBN
    978-1-4673-6066-1
  • Type
    conf
  • DOI
    10.1109/IPDPS.2013.20
  • Filename
    6569903