DocumentCode
3599907
Title
MARS: Scheduling non-local tasks in MapReduce
Author
Mingxing Tang ; Changjian Wang ; Yuxing Peng
Author_Institution
Coll. of Comput., Nat. Univ. of Defense Technol., Changsha, China
fYear
2014
Firstpage
536
Lastpage
540
Abstract
Data locality is one of the most important principles in MapReduce, and much effort has been devoted to it. However, MapReduce often contains tasks, called non-local tasks, that need to access remote data. The existing MapReduce scheduler provides only a simple strategy for scheduling non-local tasks; it performs no optimization and may even produce more non-local tasks. To address these problems, we design a new non-local task scheduling approach for MapReduce, named MARS. A task selection algorithm is proposed to choose a suitable non-local task from multiple candidate tasks before scheduling, and an overlapped scheduling algorithm is proposed to reduce the time a task spends accessing remote data. Based on this work, a new scheduling mechanism for non-local tasks is designed and implemented in MapReduce. Comprehensive experiments have been performed to verify the effectiveness of MARS. The results show that MARS can reduce Map phase runtime by 25% and achieve better data locality than native Hadoop.
Keywords
data handling; feature selection; optimisation; parallel processing; scheduling; MARS; MapReduce; data locality; nonlocal task scheduling; task selection algorithm; time optimization; Mars; Prefetching; Scheduling algorithms; Non-local tasks; overlap
fLanguage
English
Publisher
IEEE
Conference_Titel
Cloud Computing and Intelligence Systems (CCIS), 2014 IEEE 3rd International Conference on
Print_ISBN
978-1-4799-4720-1
Type
conf
DOI
10.1109/CCIS.2014.7175794
Filename
7175794