DocumentCode
604081
Title
Skew-Aware Task Scheduling in Clouds
Author
Dongsheng Li ; Yixing Chen ; Hai, R.H.
Author_Institution
Nat. Lab. for Parallel & Distrib. Process., Nat. Univ. of Defense Technol., Changsha, China
fYear
2013
fDate
25-28 March 2013
Firstpage
341
Lastpage
346
Abstract
Data skew is an important cause of stragglers in MapReduce-like cloud systems. In this paper, we propose a Skew-Aware Task Scheduling (SATS) mechanism for iterative applications in MapReduce-like systems. The mechanism exploits the similarity of data distributions in adjacent iterations of iterative applications to mitigate the straggler problem caused by data skew. It collects data distribution information while the tasks of the current iteration execute, and uses that information to guide data partitioning for the tasks of the next iteration. We implement the mechanism in the HaLoop system and deploy it in a cluster. Experiments show that the proposed mechanism can handle data skew and effectively improve load balancing.
Keywords
cloud computing; iterative methods; resource allocation; scheduling; task analysis; HaLoop system; MapReduce-like cloud systems; MapReduce-like systems; SATS mechanism; data distribution information; data partitioning; data skew; iterative applications; load balancing; skew-aware task scheduling; Computational modeling; Data models; Data structures; Distributed databases; File systems; Load management; Processor scheduling; Cloud; Data Skew; Load balancing; Task Scheduling;
fLanguage
English
Publisher
IEEE
Conference_Titel
Service Oriented System Engineering (SOSE), 2013 IEEE 7th International Symposium on
Conference_Location
Redwood City
Print_ISBN
978-1-4673-5659-6
Type
conf
DOI
10.1109/SOSE.2013.64
Filename
6525543