Title :
Workload Characteristic Oriented Scheduler for MapReduce
Author :
Peng Lu ; Young Choon Lee ; Chen Wang ; Bing Bing Zhou ; Junliang Chen ; Albert Y. Zomaya
Author_Institution :
Center for Distrib. & High Performance Comput., Univ. of Sydney, Sydney, NSW, Australia
Abstract :
Applications in many areas are increasingly developed and ported using the MapReduce framework (more specifically, Hadoop) to exploit data parallelism. The application scope of MapReduce has been extended beyond its original design goal, which was large-scale data processing. This extension creates a need for the scheduler to explicitly take job characteristics into account in pursuit of two main goals: efficient resource use and performance improvement. In this paper, we study MapReduce scheduling strategies that deal effectively with different workload characteristics, namely CPU-intensive and I/O-intensive. We present the Workload Characteristic Oriented Scheduler (WCO), which strives to co-locate tasks of possibly different MapReduce jobs with complementary resource usage characteristics. WCO is characterized by its essentially dynamic and adaptive scheduling decisions, which use information obtained from its characteristic estimator. Workload characteristics of tasks are primarily estimated by sampling, with the help of some static task selection strategies, e.g., Java bytecode analysis. Results obtained from extensive experiments using 11 benchmarks in a 4-node local cluster and a 51-node Amazon EC2 cluster show a 17% performance improvement on average in terms of throughput when diverse workloads co-exist.
Keywords :
distributed processing; scheduling; I/O intensive; MapReduce framework; WCO; adaptive scheduling; distributed computing; dynamic scheduling; large-scale data processing; performance improvement; workload characteristic oriented scheduler; Benchmark testing; Dynamic scheduling; Estimation; Heart beat; Resource management; Throughput; Hadoop; MapReduce Scheduling; Static Program Analysis; Workload Co-location
Conference_Title :
Parallel and Distributed Systems (ICPADS), 2012 IEEE 18th International Conference on
Conference_Location :
Singapore
Print_ISBN :
978-1-4673-4565-1
Electronic_ISBN :
1521-9097
DOI :
10.1109/ICPADS.2012.31