DocumentCode :
3354848
Title :
Efficient utilization of spare capacity for fault detection and location in multiprocessor systems
Author :
Tridandapani, S. ; Somani, A.K.
Author_Institution :
Washington Univ., Seattle, WA, USA
fYear :
1992
fDate :
8-10 July 1992
Firstpage :
440
Lastpage :
447
Abstract :
One scheme for detecting faults at the processor level in a multiprocessor system (see A. Dahbura et al., 1989) works by running secondary versions of jobs on the unused, or spare, processors of the system. The authors build upon this scheme and propose three new multiprocessor allocation strategies that run a viable number of versions per job. These schemes permit online detection and, in some cases, location of faulty processors in a system, without degrading its delay/throughput performance. Two new metrics, the fault detection capability and the fault location capability, are introduced to evaluate these schemes. Extensive simulation results are provided to show that these schemes utilize spare capacity more efficiently, thereby improving upon the fault detection and location capabilities of the system.
Keywords :
computer testing; fault location; fault tolerant computing; multiprocessing systems; resource allocation; fault detection; fault detection capability; fault location capability; multiprocessor allocation strategies; multiprocessor system; multiprocessor systems; spare capacity; Acceleration; Computational modeling; Computer science; Degradation; Electrical fault detection; Hardware; Throughput
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Fault-Tolerant Computing, 1992. FTCS-22. Digest of Papers., Twenty-Second International Symposium on
Conference_Location :
Boston, MA, USA
Print_ISBN :
0-8186-2875-8
Type :
conf
DOI :
10.1109/FTCS.1992.243591
Filename :
243591