Title :
Performance Variations in Resource Scaling for MapReduce Applications on Private and Public Clouds
Author :
Zhang, Fan; Sakr, Majd
Author_Institution :
Massachusetts Inst. of Technol., Cambridge, MA, USA
Date :
June 27, 2014 - July 2, 2014
Abstract :
In this paper, we delineate the causes of performance variations when scaling provisioned virtual resources for a variety of MapReduce applications. Hadoop MapReduce facilitates the development and execution of large-scale batch applications on big data. However, provisioning suitable resources to achieve the desired performance at an affordable cost requires expertise in the execution model of MapReduce, the resources available for provisioning, and the execution behavior of the application at hand. As an initial step towards automating this process, we characterize how the execution response of different MapReduce applications changes as we vary the number of virtualized CPUs, the amount of memory, the number of map slots, and the cluster size on a private cloud. This characterization illustrates the variation in how effectively Reduce-intensive and Map-intensive applications utilize provisioned resources at different scales (1-64 VMs): a 5x speedup compared to a 36x speedup, respectively. By comparing scalability efficiency, we identify where resources are under-provisioned or over-provisioned for different MapReduce applications at large scale.
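The speedup figures quoted in the abstract can be related to resource utilization with a small calculation. The sketch below is illustrative only: it assumes the conventional definitions (speedup as baseline time divided by scaled time, efficiency as speedup divided by the scale factor) and assumes both quoted speedups were measured at 64 VMs relative to a single VM. The abstract does not state either point, and the paper's own "scalability efficiency" metric may be defined differently. The function names are hypothetical.

```python
def speedup(t_base: float, t_scaled: float) -> float:
    """Conventional speedup: baseline run time divided by scaled run time."""
    return t_base / t_scaled


def scaling_efficiency(speedup_value: float, scale_factor: int) -> float:
    """Fraction of ideal linear speedup achieved (1.0 means perfect scaling)."""
    return speedup_value / scale_factor


# Speedups quoted in the abstract. The scale factor of 64 is an ASSUMPTION:
# the abstract only says the experiments span 1-64 VMs.
ASSUMED_SCALE_FACTOR = 64
quoted_speedups = {"Reduce-intensive": 5.0, "Map-intensive": 36.0}

for label, s in quoted_speedups.items():
    eff = scaling_efficiency(s, ASSUMED_SCALE_FACTOR)
    print(f"{label}: speedup {s:.0f}x, efficiency {eff:.1%}")
```

Under these assumptions, a low efficiency suggests that added VMs are poorly utilized (over-provisioning), while an efficiency close to 1.0 suggests the application could still benefit from more resources.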
Keywords :
Big Data; cloud computing; parallel programming; performance evaluation; power aware computing; resource allocation; storage management; virtual machines; virtualisation; Hadoop MapReduce; MapReduce applications; VM; cluster size; execution processes; large-scale batch applications; map slots; map-intensive applications; memory resources; performance variations; private cloud; provisioned virtual resource scaling; reduce-intensive applications; scalability efficiency; virtualized CPU; Benchmark testing; Blades; Hardware; Processor scheduling; Random access memory; dataset size; input scaling; parallel computing
Conference_Title :
Cloud Computing (CLOUD), 2014 IEEE 7th International Conference on
Conference_Location :
Anchorage, AK
Print_ISBN :
978-1-4799-5062-1
DOI :
10.1109/CLOUD.2014.68