DocumentCode :
1484640
Title :
QoS-Aware Fault-Tolerant Scheduling for Real-Time Tasks on Heterogeneous Clusters
Author :
Zhu, Xiaomin ; Qin, Xiao ; Qiu, Meikang
Author_Institution :
Sci. & Technol. on Inf. Syst. Eng. Lab., Nat. Univ. of Defense Technol., Changsha, China
Volume :
60
Issue :
6
fYear :
2011
fDate :
6/1/2011 12:00:00 AM
Firstpage :
800
Lastpage :
812
Abstract :
Fault-tolerant scheduling plays a significant role in improving the system reliability of clusters. Although many fault-tolerant scheduling algorithms have been proposed for real-time tasks in parallel and distributed systems, the quality of service (QoS) requirements of tasks have not been taken into account. This paper presents a fault-tolerant scheduling algorithm called QAFT that can tolerate one node's permanent failure at one time instant for real-time tasks with QoS needs on heterogeneous clusters. To improve system flexibility, reliability, schedulability, and resource utilization, QAFT strives either to advance the start time of primary copies and delay the start time of backup copies, so that backup copies can adopt the passive execution scheme, or to decrease the simultaneous execution time of the primary and backup copies of a task as much as possible. QAFT is capable of adaptively adjusting the QoS levels of tasks and the execution schemes of backup copies to attain high system flexibility. Furthermore, we employ the overlapping technique for backup copies. The latest start time of backup copies and their constraints are analyzed and discussed. We conduct extensive experiments to compare QAFT with two existing schemes, NOQAFT and DYFARS. Experimental results show that QAFT significantly improves scheduling quality over NOQAFT and DYFARS.
Keywords :
fault tolerant computing; quality of service (QoS); scheduling; QoS-aware fault-tolerant scheduling; distributed systems; heterogeneous clusters; parallel systems; passive execution scheme; real-time task scheduling; fault tolerance; fault tolerant systems; heuristic algorithms; real time systems; scheduling algorithm; heuristic; real-time
fLanguage :
English
Journal_Title :
Computers, IEEE Transactions on
Publisher :
IEEE
ISSN :
0018-9340
Type :
jour
DOI :
10.1109/TC.2011.68
Filename :
5740856