Title :
A Novel Bicriteria Scheduling Heuristics Providing a Guaranteed Global System Failure Rate
Author :
Girault, Alain ; Kalla, Hamoudi
Author_Institution :
LIG Lab., Grenoble Univ., St. Ismier, France
Abstract :
We propose a new framework for the (length and reliability) bicriteria static multiprocessor scheduling problem. Our first criterion remains the schedule´s length, which is crucial to assess the system´s real-time property. For our second criterion, we consider the global system failure rate, seen as if the whole system were a single task scheduled onto a single processor, instead of the usual reliability, because it does not depend on the schedule length like the reliability does (due to its computation in the classical exponential distribution model). Therefore, we control better the replication factor of each individual task of the dependency task graph given as a specification, with respect to the desired failure rate. To solve this bicriteria optimization problem, we take the failure rate as a constraint, and we minimize the schedule length. We are thus able to produce, for a given dependency task graph and multiprocessor architecture, a Pareto curve of nondominated solutions, among which the user can choose the compromise that fits his or her requirements best. Compared to the other bicriteria (length and reliability) scheduling algorithms found in the literature, the algorithm we present here is the first able to improve significantly the reliability, by several orders of magnitude, making it suitable to safety-critical systems.
Keywords :
Pareto optimisation; exponential distribution; graph theory; minimisation; multiprocessing systems; processor scheduling; program diagnostics; real-time systems; safety-critical software; system recovery; Pareto optimal curve; bicriteria optimization problem; bicriteria static multiprocessor heuristic scheduling algorithm; classical exponential distribution model; dependency task graph; global system failure rate; multiprocessor architecture; real-time system property; reliability block diagram; replication factor; safety-critical system; schedule length minimization framework; single task scheduling; Constraint optimization; Distributed computing; Embedded system; Exponential distribution; Hardware; Memory architecture; Pareto optimization; Processor scheduling; Real time systems; Scheduling algorithm; Fault tolerance; Pareto optima; Real-time and embedded systems; Reliability; Testing; and Fault-Tolerance; and serviceability; availability; bicriteria optimization; reliability block diagrams; safety-critical systems.; static multiprocessor scheduling;
Journal_Title :
Dependable and Secure Computing, IEEE Transactions on
DOI :
10.1109/TDSC.2008.50