TaskTracker Aware Scheduling for Hadoop MapReduce

Author

Manjaly, Jisha S. ; Chooralil, Varghese S.

Author_Institution

Dept. of Comput. Sci. & Eng., Mahatma Gandhi Univ., Kochi, India

fYear

2013

fDate

29-31 Aug. 2013

Firstpage

278

Lastpage

281

Abstract

Hadoop is a framework for processing large amounts of data in parallel with the help of the Hadoop Distributed File System (HDFS) and the MapReduce framework. Job scheduling is an important process in Hadoop MapReduce. Hadoop comes with three types of schedulers, namely the FIFO, Fair and Capacity schedulers. The schedulers are now a pluggable component in the Hadoop MapReduce framework. When jobs depend on an external service such as a database or Web service, tasks may fail due to overloading. In this scenario, Hadoop needs to re-run the tasks in other slots. To address this issue, TaskTracker aware scheduling has been introduced. This scheduler enables users to configure a maximum load per TaskTracker in the job configuration itself. The algorithm will not allow a task to run and fail if the load of the TaskTracker reaches its threshold for the job. The scheduler also allows users to select the TaskTrackers per job in the job configuration.
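
The admission rule described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation and does not use the Hadoop API: the class names, the field names (`max_tasks_per_tracker`, `allowed_trackers`), and the choice to measure "load" as the number of the job's tasks currently running on a tracker are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set


@dataclass
class JobConfig:
    """Hypothetical per-job settings; names are illustrative, not Hadoop's."""
    name: str
    # Assumed reading of "maximum load per TaskTracker": a cap on how many
    # tasks of this job may run on one tracker at the same time.
    max_tasks_per_tracker: int
    # Trackers the user selected for this job; None means any tracker.
    allowed_trackers: Optional[Set[str]] = None


@dataclass
class TaskTracker:
    """Hypothetical model of a worker node with a fixed number of slots."""
    name: str
    free_slots: int
    # Job name -> number of that job's tasks currently running here.
    running: Dict[str, int] = field(default_factory=dict)


def can_assign(job: JobConfig, tracker: TaskTracker) -> bool:
    """Return True if one more task of `job` may be placed on `tracker`."""
    if tracker.free_slots <= 0:
        return False
    if job.allowed_trackers is not None and tracker.name not in job.allowed_trackers:
        return False
    # Refuse the task up front instead of letting it start and fail.
    return tracker.running.get(job.name, 0) < job.max_tasks_per_tracker


def assign(job: JobConfig, tracker: TaskTracker) -> bool:
    """Place one task of `job` on `tracker` if the rule allows it."""
    if not can_assign(job, tracker):
        return False
    tracker.free_slots -= 1
    tracker.running[job.name] = tracker.running.get(job.name, 0) + 1
    return True
```

Under these assumptions, a job configured with a cap of 2 and restricted to tracker `tt1` would have its third task on `tt1` held back, and any task offered to another tracker declined, rather than started and re-run after a failure.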

Keywords

distributed databases; parallel programming; scheduling; Capacity scheduler; FIFO scheduler; Fair scheduler; HDFS; Hadoop Distributed File System; Hadoop MapReduce framework; TaskTracker aware scheduling; Web service; database management; external service; job configuration; job scheduling; large-data processing; pluggable component; Distributed databases; Educational institutions; Handover; Heart beat; Processor scheduling; BigData; HDFS; Hadoop; JobTracker; MapReduce; Scheduler; TaskTracker;

fLanguage

English

Publisher

IEEE

Conference_Titel

Advances in Computing and Communications (ICACC), 2013 Third International Conference on

Conference_Location

Cochin

Type

conf

DOI

10.1109/ICACC.2013.103

Filename

6686388