Title :
Node covering, error correcting codes and multiprocessors with very high average fault tolerance
Author :
Dutt, S. ; Mahapatra, N.R.
Author_Institution :
Dept. of Electr. Eng., Minnesota Univ., Minneapolis, MN, USA
Abstract :
Most previous work on fault-tolerant (FT) multiprocessor design has concentrated on deterministic k-fault-tolerant (k-FT) designs in which exactly k spare processors and some spare switches and links are added to construct multiprocessors that can tolerate any k processor faults. However, after k faults are reconfigured around, many of the extra links and switches can remain unutilized. We show how to use the node-covering principle of Dutt and Hayes (1992) and error correcting codes to construct probabilistic designs with very high average fault tolerance but low wiring and switch overhead. This design methodology is applicable to any multiprocessor interconnection topology. We also obtain the deterministic fault tolerance for these designs and develop efficient layout strategies for them.
Keywords :
error correction codes; fault tolerant computing; multiprocessor interconnection networks; reconfigurable architectures; reliability; switches; switching circuits; deterministic fault tolerance; deterministic k-fault-tolerant designs; efficient layout strategies; error correcting codes; exactly k spare processors; fault-tolerant multiprocessor design; low switch overhead; low wiring overhead; multiprocessor interconnection topology; multiprocessors; node covering; probabilistic designs; processor faults; spare links; spare switches; very high average fault tolerance; Degradation; Design methodology; Error correction codes; Fault tolerance; Hardware; Multiprocessor interconnection; Network topology; Switches; Very large scale integration; Wiring;
Conference_Title :
Fault-Tolerant Computing, 1995. FTCS-25. Digest of Papers., Twenty-Fifth International Symposium on
Conference_Location :
Pasadena, CA, USA
Print_ISBN :
0-8186-7079-7
DOI :
10.1109/FTCS.1995.466967