Title :
Efficient diagnosis in algorithm-based fault tolerant multiprocessor systems
Author :
Srinivasan, Santhanam ; Jha, Niraj K.
Author_Institution :
Dept. of Electr. Eng., Princeton Univ., NJ, USA
Abstract :
The conventional assumption that checks in an algorithm-based fault tolerant (ABFT) system can be invalidated due to aliasing of erroneous data elements has complicated the task of error detection and location. In this paper we show that aliasing is a very rare occurrence. We then present a simple polynomial-time diagnosis algorithm which takes advantage of this result to run much more efficiently than the conventional method of diagnosis. We introduce the concepts of NC-detectability and NC-locatability to measure the fault tolerance of the system when check invalidation does not occur, and show how to design systems with specified error detectability/NC-detectability and locatability/NC-locatability. For the data-check graphs designed using these methods, when aliasing does not occur, our diagnosis algorithm has a worst-case complexity of O(s^2 n^2 log n), where s is the error locatability and n is the number of data elements in the system. We also consider the case where the processors which compute the checks themselves fail.
Keywords :
computational complexity; computer testing; fault diagnosis; fault tolerant computing; multiprocessing systems; parallel algorithms; reliability; NC-detectability; NC-locatability; algorithm-based fault tolerant multiprocessor systems; aliasing; complexity; data-check graphs; erroneous data elements; error locatability; polynomial-time diagnosis algorithm; Algorithm design and analysis; Design methodology; Electrical fault detection; Fault detection; Fault diagnosis; Fault tolerance; Fault tolerant systems; Multiprocessing systems; Polynomials; Terminology;
Conference_Title :
Computer Design: VLSI in Computers and Processors, 1993. ICCD '93. Proceedings., 1993 IEEE International Conference on
Conference_Location :
Cambridge, MA
Print_ISBN :
0-8186-4230-0
DOI :
10.1109/ICCD.1993.393308