Title :
A kernel for multi-level fault-tolerant multiprocessing
Author :
Cuyvers, Rudi ; Lauwereins, Rudy ; Peperstraete, J.A.
Author_Institution :
ESAT Lab., Katholieke Univ. Leuven, Heverlee, Belgium
Abstract :
A run-time kernel is presented that lets the user choose the number of cold, warm, hot, and active backups for each task of a parallel program, or of several different parallel programs. In this way, multiple processes with different levels of hardware fault-tolerance can run concurrently on the same multiprocessor, leading to cost-effective fault-tolerant multiprocessing. The kernel consists of two main modules. The network module realizes fault-tolerant internode routing. The node module offers the above-mentioned multilevel node fault-tolerance. It is kept simple by implementing the argument flow program organization, a combination of classical control flow and dynamic data flow. The kernel is implemented on a transputer system. Software fault-tolerance can easily be provided.
Keywords :
fault tolerant computing; multiprocessing programs; multiprocessing systems; network operating systems; active backups; argument flow program organization; dynamic data flow; fault-tolerant internode routing; hardware fault-tolerance; message passing; multi-level fault-tolerant multiprocessing; multilevel node fault-tolerance; network module; node module; run-time kernel; transputer system; Assembly; Centralized control; Communication system control; Fault tolerance; Hardware; Kernel; Message passing; Physical layer; Protocols; Routing;
Conference_Title :
IEEE Proceedings of Southeastcon '91
Conference_Location :
Williamsburg, VA
Print_ISBN :
0-7803-0033-5
DOI :
10.1109/SECON.1991.147747