DocumentCode :
1665508
Title :
A Study of Data Locality in YARN
Author :
Elshater, Yehia ; Martin, Patrick ; Rope, Dan ; McRoberts, Mike ; Statchuk, Craig
Author_Institution :
Sch. of Comput., Queen's Univ., Kingston, ON, Canada
fYear :
2015
Firstpage :
174
Lastpage :
181
Abstract :
Placing computation as close as possible to the data is an important consideration in current data-intensive systems. This is known as the data locality problem. In this paper, we analyze the impact of data locality on YARN, the new version of Hadoop. We investigate the behavior of the YARN delay scheduler with respect to data locality for a variety of workloads and configurations. We address three problems related to data locality. First, we study the trade-off between data locality and job completion time. Second, we observe that considering data locality leads to an imbalance in resource allocation, which may under-utilize the cluster. Third, we address the redundant I/O operations that occur when different YARN containers request input data blocks on the same node. Additionally, we propose the YARN Locality Simulator (YLocSim), a tool that simulates the interactions between YARN components in a real cluster and reports data locality percentages in real time. We validate YLocSim against a real cluster setup and use it in our study.
Keywords :
data handling; digital simulation; input-output programs; parallel processing; resource allocation; scheduling; Hadoop; I/O operation; YARN delay scheduler behavior; YARN locality simulator tool; YLocSim; data intensive system; data locality; resource allocation; Bandwidth; Benchmark testing; Containers; Delays; Resource management; Scheduling; Yarn; Data Locality; Hadoop; Scheduling; Simulation; YARN;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Big Data (BigData Congress), 2015 IEEE International Congress on
Conference_Location :
New York, NY
Print_ISBN :
978-1-4673-7277-0
Type :
conf
DOI :
10.1109/BigDataCongress.2015.33
Filename :
7207217