DocumentCode :
2194261
Title :
BAR: An Efficient Data Locality Driven Task Scheduling Algorithm for Cloud Computing
Author :
Jin, Jiahui ; Luo, Junzhou ; Song, Aibo ; Dong, Fang ; Xiong, Runqun
Author_Institution :
Sch. of Comput. Sci. & Eng., Southeast Univ., Nanjing, China
fYear :
2011
fDate :
23-26 May 2011
Firstpage :
295
Lastpage :
304
Abstract :
Large-scale data processing has become increasingly common in recent years in cloud computing systems such as MapReduce, Hadoop, and Dryad. In these systems, files are split into many small blocks, and every block is replicated over several servers. To process files efficiently, each job is divided into many tasks, and each task is allocated to a server to deal with one file block. Because network bandwidth is a scarce resource in these systems, enhancing task data locality (placing tasks on servers that contain their input blocks) is crucial for the job completion time. Although there have been many approaches to improving data locality, most of them either are greedy and ignore global optimization, or suffer from high computational complexity. To address these problems, we propose a heuristic task scheduling algorithm called Balance-Reduce (BAR), in which an initial task allocation is produced first, and the job completion time is then reduced gradually by tuning that initial allocation. By taking a global view, BAR can adjust data locality dynamically according to network state and cluster workload. The simulation results show that BAR is able to deal with large problem instances in a few seconds and outperforms previous related algorithms in terms of the job completion time.
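The abstract gives only the two-phase outline of BAR (produce an initial allocation, then tune it to shorten the job completion time), not its details. The sketch below is a simplified illustration of that outline and should not be read as the authors' algorithm. The function name `balance_reduce_sketch`, the fixed `local_cost` and `remote_cost` values, and the greedy move rule are all assumptions made for this example.

```python
def balance_reduce_sketch(replicas, n_servers, local_cost=1.0, remote_cost=1.5):
    """Illustrative two-phase scheduler (hypothetical, not the paper's BAR).

    replicas: dict mapping task id -> list of servers holding its input block.
    Returns (load per server, placement dict task -> (server, cost)).
    """
    load = [0.0] * n_servers
    placement = {}

    # Phase 1 (initial allocation): run every task data-locally, on the
    # least-loaded server among those that hold its input block.
    for task, holders in replicas.items():
        server = min(holders, key=lambda h: load[h])
        placement[task] = (server, local_cost)
        load[server] += local_cost

    # Phase 2 (tuning): move one task at a time from the busiest server to
    # the idlest one, paying remote_cost if the block is not stored there,
    # as long as the receiving server stays below the busiest server's load.
    while True:
        busiest = max(range(n_servers), key=lambda s: load[s])
        idlest = min(range(n_servers), key=lambda s: load[s])
        task = next(t for t, (s, _) in placement.items() if s == busiest)
        old_cost = placement[task][1]
        new_cost = local_cost if idlest in replicas[task] else remote_cost
        if load[idlest] + new_cost >= load[busiest]:
            break  # no move shortens the schedule any further
        load[busiest] -= old_cost
        load[idlest] += new_cost
        placement[task] = (idlest, new_cost)

    return load, placement


# Eight tasks whose blocks live only on servers 0 and 1; server 2 holds no data.
replicas = {f"t{i}": [0, 1] for i in range(8)}
load, placement = balance_reduce_sketch(replicas, n_servers=3)
print(max(load))  # job completion time (makespan) after tuning
```

In this toy instance the all-local allocation loads servers 0 and 1 with four tasks each, and the tuning phase trades two local tasks for remote ones on the idle server to even out the load. The real algorithm, according to the abstract, also takes network state and cluster workload into account, which this sketch reduces to a single constant remote cost.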
Keywords :
cloud computing; computational complexity; file servers; resource allocation; BAR; Dryad system; Hadoop system; MapReduce system; balance-reduce algorithm; data locality driven task scheduling algorithm; job completion time; large scale data processing; scarce resource network bandwidth; task allocation; Clustering algorithms; Heuristic algorithms; Processor scheduling; Resource management; Scheduling; Servers; Data Locality; Dryad; Hadoop; Task Scheduling;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Cluster, Cloud and Grid Computing (CCGrid), 2011 11th IEEE/ACM International Symposium on
Conference_Location :
Newport Beach, CA
Print_ISBN :
978-1-4577-0129-0
Electronic_ISBN :
978-0-7695-4395-6
Type :
conf
DOI :
10.1109/CCGrid.2011.55
Filename :
5948620