Title :
Hierarchical design and analysis of fault-tolerant multiprocessor systems using concurrent error detection
Author :
Nair, V.S.S. ; Abraham, J.A.
Author_Institution :
Center for Reliable and High Performance Comput., Illinois Univ., Urbana, IL, USA
Abstract :
A composition technique for building large fault-tolerant systems hierarchically using the concept of checks at different levels in the hierarchy is described. A small system of known fault detectability and locatability is replicated several times, and new checks are added at the next higher level. Such checks at different levels can be introduced into most of the existing multiprocessor systems. An analysis technique based on a matrix model is developed. Relationships between the fault detectability and locatability of a basic system are derived, and the corresponding values of the complete system are computed hierarchically. Finally, the techniques are extended to complex systems in which individual processors produce multiple sets of data elements.
Keywords :
error detection; fault tolerant computing; multiprocessing systems; complex systems; composition technique; concurrent error detection; data elements; fault detectability; fault-tolerant multiprocessor systems; hierarchical design; locatability; matrix model; Algorithm design and analysis; Computer errors; Concurrent computing; Contracts; Design engineering; Fault detection; Fault diagnosis; Fault tolerant systems; Multiprocessing systems; Reliability engineering
Conference_Title :
Fault-Tolerant Computing, 1990. FTCS-20. Digest of Papers., 20th International Symposium
Conference_Location :
Newcastle Upon Tyne, UK
Print_ISBN :
0-8186-2051-X
DOI :
10.1109/FTCS.1990.89348