DocumentCode
3783383
Title
Off-line real-time fault-tolerant scheduling
Author
C. Dima;A. Girault;C. Lavarenne;Y. Sorel
Author_Institution
INRIA, Montbonnot St. Martin, France
fYear
2001
fDate
2001
Firstpage
410
Lastpage
417
Abstract
We address the problem of off-line fault-tolerant scheduling of an algorithm onto a multiprocessor architecture with distributed memory, and provide a generic algorithm which solves this problem. We take into account two kinds of failures: fail-silent and omission. The basic technique we use is the replication of operations and data communications. We then discuss the principles which govern the execution of schedules with replication under the state-machine and the primary/backup arbitrations between replicas. We also show how to compute the execution date of each operation and the timeouts used for detecting failures. We end with a heuristic which, using this calculus, computes a possibly non-optimal schedule by finding plain schedules for each failure pattern and then combining them into a schedule with replication.
Keywords
"Fault tolerance","Processor scheduling","Scheduling algorithm","Fault tolerant systems","Protocols","Topology","Heuristic algorithms","Time factors","NP-complete problem","Hardware"
Publisher
IEEE
Conference_Titel
Parallel and Distributed Processing, 2001. Proceedings. Ninth Euromicro Workshop on
Print_ISBN
0-7695-0987-8
Type
conf
DOI
10.1109/EMPDP.2001.905069
Filename
905069