DocumentCode :
349219
Title :
An efficient algorithm-based fault detection and recovery on multiprocessor systems
Author :
Ali, Samia A. ; Mahdy, Youssef B. ; Hassan, Hassan A.
Author_Institution :
Dept. of Electr. Eng., Assiut Univ., Egypt
Volume :
2
fYear :
1999
fDate :
5-8 Sep 1999
Firstpage :
1093
Abstract :
Algorithm-Based Fault Tolerance (ABFT) schemes have been proposed as a means of low-cost error protection for parallel algorithms. This paper presents a modified fault-tolerant scheme for matrix multiplication on multiprocessor systems. The proposed scheme increases fault detectability through a new partitioning scheme for the system's processors. The time overhead of the modified recovery algorithm is reduced by a new weighted checksum code based only on shifting, not multiplication. A Triple Modular Redundancy (TMR) host, which is itself part of the multiprocessor system, is used to avoid the need for an expensive separate host. Thus, the proposed system possesses higher reliability at lower time overhead and cost.
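The abstract does not give the details of the checksum code, so the following is only a minimal sketch of the general idea, assuming a classic two-checksum ABFT layout with power-of-two weights so that the weighted checksum is built with shifts alone. The function names (`matmul`, `append_row_checksums`, `check_row`) and the integer-only, single-error-per-row model are illustrative assumptions, not the authors' scheme, and the processor partitioning and TMR host are not modeled.

```python
def matmul(a, b):
    """Plain integer matrix product; stands in for the parallel computation."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def append_row_checksums(b):
    """Append a plain checksum and a shift-weighted checksum to each row.

    Assumed weights are 2**j, so each weighted term is just `v << j`.
    """
    return [row + [sum(row), sum(v << j for j, v in enumerate(row))]
            for row in b]


def check_row(row_ext):
    """Detect, locate and correct one faulty data element in a result row.

    Assumes at most one error, and that it lies in a data element rather
    than in a checksum element. Returns (corrected_data, faulty_index),
    with faulty_index None when the row is consistent.
    """
    data, plain, weighted = list(row_ext[:-2]), row_ext[-2], row_ext[-1]
    s1 = sum(data) - plain                                        # error value e
    s2 = sum(v << j for j, v in enumerate(data)) - weighted       # e * 2**j
    if s1 == 0 and s2 == 0:
        return data, None
    if s1 == 0 or s2 % s1 != 0:
        raise ValueError("error pattern outside the single-error model")
    ratio = s2 // s1                                              # should be 2**j
    j = ratio.bit_length() - 1
    if ratio <= 0 or ratio != 1 << j or j >= len(data):
        raise ValueError("error pattern outside the single-error model")
    data[j] -= s1                                                 # undo the error
    return data, j


# Checksums propagate through the product by linearity: A * [B | checksums]
# yields C with a valid checksum pair at the end of every row.
a = [[1, 2], [3, 4]]
b = [[5, 6, 7], [8, 9, 10]]
c_ext = matmul(a, append_row_checksums(b))

c_ext[1][2] += 5                     # inject a fault into one result element
fixed_row, faulty_col = check_row(c_ext[1])
```

In this sketch the ratio of the two syndromes is a power of two whose exponent is the faulty column, so locating the error needs only a bit-length lookup rather than a division by arbitrary weights.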
Keywords :
error correction; error detection; fault tolerant computing; matrix multiplication; multiprocessing systems; parallel algorithms; redundancy; reliability; system recovery; algorithm-based fault tolerance; efficient algorithm-based fault detection; low-cost error protection; matrix multiplication; modified fault tolerant scheme; modified recovery algorithm; multiprocessor systems; overhead cost; overhead time; parallel algorithms; partition scheme; recovery; reliability; shifting; time overhead; triple modular redundancy; Costs; Electrical fault detection; Error correction; Fault detection; Fault tolerant systems; Multiprocessing systems; Parallel algorithms; Partitioning algorithms; Protection; Redundancy;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Electronics, Circuits and Systems, 1999. Proceedings of ICECS '99. The 6th IEEE International Conference on
Conference_Location :
Pafos
Print_ISBN :
0-7803-5682-9
Type :
conf
DOI :
10.1109/ICECS.1999.813424
Filename :
813424