DocumentCode
167615
Title
A Task Scheduling Algorithm Based on Replication for Maximizing Reliability on Heterogeneous Computing Systems
Author
Shuli Wang ; Kenli Li ; Jing Mei ; Keqin Li ; Yan Wang
Author_Institution
Coll. of Inf. Sci. & Eng., Hunan Univ., Changsha, China
fYear
2014
fDate
19-23 May 2014
Firstpage
1562
Lastpage
1571
Abstract
Over the past several years, heterogeneous computing (HC) systems have become more competitive as commercial computing platforms than homogeneous systems. With the growing scale of HC systems, network failures become inevitable. To achieve high performance, communication reliability should be considered when designing reliability-aware task scheduling algorithms. In this paper, we propose a new algorithm called RMSR (Replication-based scheduling for Maximizing System Reliability), which incorporates task communication into system reliability. To maximize communication reliability, an improved algorithm that searches all optimal-reliability communication paths for the current tasks is proposed. During the task replication phase, the task reliability threshold is set by users, and the number of replicas of each task is determined dynamically. Our comparative studies based on randomly generated graphs show that the RMSR algorithm outperforms existing scheduling algorithms in terms of system reliability. Several factors affecting performance are also analyzed.
Keywords
graph theory; optimisation; redundancy; telecommunication network reliability; RMSR algorithm; communication reliability; heterogeneous computing system; network failure; optimal reliability communication path; randomly generated graph; reliability-aware task scheduling algorithm; replication-based scheduling for maximizing system reliability; task reliability threshold; task replication phase; Algorithm design and analysis; Computational modeling; Equations; Mathematical model; Program processors; Reliability; Scheduling algorithms; Directed acyclic graph; Heterogeneous computing systems; Reliability-aware scheduling; Replication-based algorithm;
fLanguage
English
Publisher
IEEE
Conference_Titel
Parallel & Distributed Processing Symposium Workshops (IPDPSW), 2014 IEEE International
Conference_Location
Phoenix, AZ
Print_ISBN
978-1-4799-4117-9
Type
conf
DOI
10.1109/IPDPSW.2014.175
Filename
6969562