DocumentCode :
236537
Title :
Locality and Network-Aware Reduce Task Scheduling for Data-Intensive Applications
Author :
Arslan, Engin ; Shekhar, Mrigank ; Kosar, Tevfik
Author_Institution :
Univ. at Buffalo, Buffalo, NY, USA
fYear :
2014
fDate :
21 Nov. 2014
Firstpage :
17
Lastpage :
24
Abstract :
MapReduce is one of the leading programming frameworks for implementing data-intensive applications by distributing map and reduce tasks across servers. Although there has been a substantial amount of work on map task scheduling and optimization in the literature, work on reduce task scheduling is very limited. Effective scheduling of reduce tasks onto resources becomes especially important for the performance of data-intensive applications, where large amounts of data are moved between the map and reduce tasks. In this paper, we propose a new algorithm (LoNARS) for reduce task scheduling, which takes both data locality and network traffic into consideration. Data locality awareness aims to schedule the reduce tasks closer to the map tasks, decreasing both the delay in data access and the amount of traffic pushed to the network. Network traffic awareness aims to distribute the traffic over the whole network and minimize hotspots, reducing the effect of network congestion on data transfers. We have integrated LoNARS into Hadoop-1.2.1. Using our LoNARS algorithm, we achieved up to 15% gain in data shuffling time and up to 3-4% improvement in total job completion time compared to the other reduce task scheduling algorithms. Moreover, we reduced the amount of traffic on network switches by 15%, which helps to reduce energy consumption considerably.
Keywords :
data handling; distributed processing; optimisation; Data locality awareness; data access; data intensive applications; distributed servers; locality aware reduce task scheduling; map task scheduling; network aware reduce task scheduling; optimization; Benchmark testing; Cost function; Heart beat; Scheduling; Scheduling algorithms; Servers;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Data-Intensive Computing in the Clouds (DataCloud), 2014 5th International Workshop on
Conference_Location :
New Orleans, LA
Type :
conf
DOI :
10.1109/DataCloud.2014.10
Filename :
7017949