DocumentCode
1484640
Title
QoS-Aware Fault-Tolerant Scheduling for Real-Time Tasks on Heterogeneous Clusters
Author
Zhu, Xiaomin ; Qin, Xiao ; Qiu, Meikang
Author_Institution
Sci. & Technol. on Inf. Syst. Eng. Lab., Nat. Univ. of Defense Technol., Changsha, China
Volume
60
Issue
6
fYear
2011
fDate
6/1/2011 12:00:00 AM
Firstpage
800
Lastpage
812
Abstract
Fault-tolerant scheduling plays a significant role in improving the system reliability of clusters. Although many fault-tolerant scheduling algorithms have been proposed for real-time tasks in parallel and distributed systems, the quality of service (QoS) requirements of tasks have not been taken into account. This paper presents a fault-tolerant scheduling algorithm called QAFT that can tolerate one node's permanent failures at one time instant for real-time tasks with QoS needs on heterogeneous clusters. To improve system flexibility, reliability, schedulability, and resource utilization, QAFT strives either to advance the start time of primary copies and delay the start time of backup copies, so that backup copies can adopt the passive execution scheme, or to decrease the simultaneous execution time of the primary and backup copies of a task as much as possible. QAFT is capable of adaptively adjusting the QoS levels of tasks and the execution schemes of backup copies to attain high system flexibility. Furthermore, we employ an overlapping technique for backup copies. The latest start time of backup copies and their constraints are analyzed and discussed. We conduct extensive experiments to compare QAFT with two existing schemes, NOQAFT and DYFARS. Experimental results show that QAFT significantly improves scheduling quality over NOQAFT and DYFARS.
Keywords
fault tolerant computing; quality of service; scheduling; QoS-aware fault-tolerant scheduling; distributed systems; heterogeneous clusters; parallel systems; passive execution scheme; quality of service; realtime task scheduling; Fault tolerance; Fault tolerant systems; Heuristic algorithms; Quality of service; Real time systems; Scheduling algorithm; Heterogeneous clusters; fault tolerance; heuristic; quality of service (QoS); real-time; scheduling
fLanguage
English
Journal_Title
Computers, IEEE Transactions on
Publisher
IEEE
ISSN
0018-9340
Type
jour
DOI
10.1109/TC.2011.68
Filename
5740856