DocumentCode
2850247
Title
A distribution aware scheduling method in MapReduce
Author
Zhang, Xiaohong ; Ding, Yang
Author_Institution
Sch. of Comput. Sci. & Technol., Henan Polytech. Univ., Jiaozuo, China
fYear
2012
fDate
24-27 June 2012
Firstpage
128
Lastpage
131
Abstract
Data locality is one of the critical factors affecting system performance. In this paper, we focus on the data locality problem in Hadoop MapReduce and propose a scheduling method to improve it. After receiving a request from a node, the method selects a task by searching the node's first level, then its second and third levels. It then checks whether the selected task is the only one on the first level of the node to issue a request. If so, the method skips the selected task and selects another task for the requesting node. Otherwise, it schedules the selected task to the node. We have analyzed the method. Compared with the default scheduling method of Hadoop MapReduce, the proposed method can improve the efficiency of data locality.
Keywords
Internet; data handling; public domain software; scheduling; software performance evaluation; Hadoop MapReduce; Internet technologies; data locality problem; distribution aware scheduling method; system performance; Educational institutions; Nonhomogeneous media; Data intensive applications; Data locality; MapReduce; Scheduling;
fLanguage
English
Publisher
IEEE
Conference_Titel
Electrical & Electronics Engineering (EEESYM), 2012 IEEE Symposium on
Conference_Location
Kuala Lumpur
Print_ISBN
978-1-4673-2363-5
Type
conf
DOI
10.1109/EEESym.2012.6258605
Filename
6258605