Title :
Cura: A Cost-Optimized Model for MapReduce in a Cloud
Author :
Palanisamy, Balaji ; Singh, Ashutosh ; Langston, Bryan
Author_Institution :
Coll. of Comput., Georgia Tech, Atlanta, GA, USA
Abstract :
We propose a new MapReduce cloud service model, Cura, for data analytics in the cloud. We argue that performing MapReduce analytics in existing cloud service models - either using a generic compute cloud or a dedicated MapReduce cloud - is inadequate and inefficient for production workloads. Existing services require users to select a number of complex cluster and job parameters while simultaneously forcing the cloud provider to use those potentially sub-optimal configurations, resulting in poor resource utilization and higher cost. In contrast, Cura leverages MapReduce profiling to automatically create the best cluster configuration for the jobs so as to obtain a global resource optimization from the provider's perspective. Secondly, to better serve modern MapReduce workloads, which include a large proportion of interactive real-time jobs, Cura uses a unique instant VM allocation technique that reduces response times by up to 65%. Thirdly, our system introduces deadline-awareness which, by delaying execution of certain jobs, allows the cloud provider to optimize its global resource allocation and reduce costs further. Cura also benefits from a number of additional performance enhancements including cost-aware resource provisioning, VM-aware scheduling and online virtual machine reconfiguration. Our experimental results using Facebook-like workload traces show that, along with response time improvements, our techniques lead to a reduction of more than 80% in the compute infrastructure cost of the cloud data center.
Keywords :
cloud computing; computer centres; resource allocation; software cost estimation; virtual machines; Cura; Facebook-like workload; MapReduce analytics; MapReduce cloud service model; MapReduce profiling; MapReduce workload; VM allocation technique; VM-aware scheduling; cloud data center; cloud provider; cluster configuration; complex cluster; cost reduction; cost-aware resource provisioning; cost-optimized model; data analytics; deadline-awareness; dedicated MapReduce cloud; generic compute cloud; global resource allocation; global resource optimization; interactive real-time job; job execution; job parameter; online virtual machine reconfiguration; production workload; resource utilization; suboptimal configuration; Adaptation models; Computational modeling; Optimization; Production; Resource management; Schedules; Time factors;
Conference_Title :
Parallel & Distributed Processing (IPDPS), 2013 IEEE 27th International Symposium on
Conference_Location :
Boston, MA
Print_ISBN :
978-1-4673-6066-1
DOI :
10.1109/IPDPS.2013.20