Title :
Fault-tolerant adaptive and minimal routing in mesh-connected multicomputers using extended safety levels
Author_Institution :
Dept. of Comput. Sci. & Eng., Florida Atlantic Univ., Boca Raton, FL, USA
Abstract :
The minimal routing problem in mesh-connected multicomputers with faulty blocks is studied. Two-dimensional (2D) meshes are used to illustrate the approach. A sufficient condition for minimal routing in 2D meshes with faulty blocks is proposed. Unlike many other models, which assume that every node knows the global fault distribution, our approach is based on the concept of an extended safety level, a special form of limited fault information. Fault information is distributed to only a limited number of nodes, yet it remains sufficient to support minimal routing. We study three issues: the existence of minimal paths at a given source node, the limited distribution of fault information, and minimal routing itself. The proposed approach is also adaptive, allowing every message to use any minimal path. Our approach is the first attempt to address adaptive and minimal routing in 2D meshes with faulty blocks using limited fault information.
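To make the terms "minimal path" and "adaptive" concrete, the following is a small illustrative sketch, not the extended-safety-level scheme of the paper. It assumes global knowledge of the faulty nodes, which is exactly what the paper avoids, and all names (`minimal_path_exists`, `minimal_next_hops`, the coordinate convention) are hypothetical. It checks whether a fault-free path of Manhattan-distance length exists between two nodes of a 2D mesh, and lists the next hops an adaptive minimal router could choose from.

```python
from functools import lru_cache


def minimal_path_exists(faulty, src, dst):
    """Return True if a fault-free minimal (Manhattan-length) path
    from src to dst exists. Assumes global fault knowledge."""
    faulty = frozenset(faulty)
    sx, sy = src
    dx, dy = dst
    # A minimal path only ever moves toward the destination.
    step_x = 1 if dx >= sx else -1
    step_y = 1 if dy >= sy else -1

    @lru_cache(maxsize=None)
    def reachable(x, y):
        if (x, y) in faulty:
            return False
        if (x, y) == (dx, dy):
            return True
        if x != dx and reachable(x + step_x, y):
            return True
        if y != dy and reachable(x, y + step_y):
            return True
        return False

    return reachable(sx, sy)


def minimal_next_hops(faulty, cur, dst):
    """List the neighbors of cur that are one hop closer to dst and
    from which a fault-free minimal path to dst still exists."""
    cx, cy = cur
    dx, dy = dst
    hops = []
    if cx != dx:
        nxt = (cx + (1 if dx > cx else -1), cy)
        if minimal_path_exists(faulty, nxt, dst):
            hops.append(nxt)
    if cy != dy:
        nxt = (cx, cy + (1 if dy > cy else -1))
        if minimal_path_exists(faulty, nxt, dst):
            hops.append(nxt)
    return hops


# Hypothetical 2x2 faulty block in the middle of a 4x4 mesh.
block = {(1, 1), (1, 2), (2, 1), (2, 2)}
print(minimal_path_exists(block, (0, 0), (3, 3)))
print(minimal_next_hops(block, (0, 0), (3, 3)))
```

In this sketch a message is routed adaptively by picking any entry of `minimal_next_hops` at each step; the paper's contribution is to make an equivalent decision with limited fault information stored at only some nodes instead of the global fault set assumed here.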
Keywords :
fault tolerant computing; message passing; multiprocessing systems; multiprocessor interconnection networks; network routing; parallel architectures; 2D meshes; extended safety levels; fault-tolerant adaptive routing; faulty blocks; global fault distribution; mesh-connected multicomputers; minimal paths; minimal routing; two dimensional meshes; Communication networks; Communication switching; Computer science; Costs; Fault tolerance; Hypercubes; Network topology; Routing protocols; Safety; Telecommunication network reliability
Conference_Title :
Distributed Computing Systems, 1998. Proceedings. 18th International Conference on
Conference_Location :
Amsterdam
Print_ISBN :
0-8186-8292-2
DOI :
10.1109/ICDCS.1998.679771