DocumentCode
3349452
Title
Tolerating faults in a mesh with a row of spare nodes
Author
Bruck, Jehoshua ; Cypher, Robert ; Ho, Ching-Tien
Author_Institution
IBM Almaden Res. Center, San Jose, CA, USA
fYear
1992
fDate
1-4 Dec 1992
Firstpage
12
Lastpage
19
Abstract
The authors present an efficient method for tolerating faults in a two-dimensional mesh architecture. The approach is based on adding spare components (nodes) and extra links (edges) such that the resulting architecture can be reconfigured as a mesh in the presence of faults. The cost of the fault-tolerant mesh architecture is optimized by adding about one row of redundant nodes in addition to a set of k spare nodes (while tolerating up to k node faults) and minimizing the number of links per node. The results are surprisingly efficient and seem to be practical for small values of k. The degree of the fault-tolerant architecture is k+5 for odd k, and k+6 for even k. The results can be generalized to d-dimensional meshes such that the number of spare nodes is less than the length of the shortest axis plus k, and the degree of the fault-tolerant mesh is (d-1)k+d+3 when k is odd and (d-1)k+2d+2 when k is even.
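The degree formulas quoted in the abstract can be tabulated with a minimal Python sketch. The function name `fault_tolerant_degree` and the input checks (k >= 1, d >= 2) are assumptions made for illustration; they do not come from the paper, which may state different validity ranges.

```python
def fault_tolerant_degree(k: int, d: int = 2) -> int:
    """Node degree quoted in the abstract for a d-dimensional mesh
    that tolerates up to k node faults.

    The name and the range checks are illustrative assumptions;
    only the two formulas are taken from the abstract.
    """
    if k < 1 or d < 2:
        raise ValueError("assumed ranges: k >= 1 and d >= 2")
    if k % 2 == 1:
        # Odd k: (d-1)k + d + 3
        return (d - 1) * k + d + 3
    # Even k: (d-1)k + 2d + 2
    return (d - 1) * k + 2 * d + 2


if __name__ == "__main__":
    # Print the two-dimensional degrees for small k.
    for k in range(1, 5):
        print(k, fault_tolerant_degree(k))
```

With d = 2 the general expressions reduce to the two-dimensional values given in the abstract: k+5 for odd k and k+6 for even k.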
Keywords
fault tolerant computing; parallel architectures; fault tolerance; fault-tolerant architecture; fault-tolerant mesh; spare components; two-dimensional mesh architecture; Computer architecture; Cost function; Fabrication; Fault tolerance; Joining processes; Large scale integration; Microprocessors; Parallel machines; Redundancy; Switches
fLanguage
English
Publisher
IEEE
Conference_Titel
Parallel and Distributed Processing, 1992. Proceedings of the Fourth IEEE Symposium on
Conference_Location
Arlington, TX
Print_ISBN
0-8186-3200-3
Type
conf
DOI
10.1109/SPDP.1992.242768
Filename
242768