Title :
Fault-tolerant wormhole routing algorithms for mesh networks
Author :
Boppana, Rajendra V. ; Chalasani, Suresh
Author_Institution :
Div. of Comput. Sci., Texas Univ., San Antonio, TX, USA
fDate :
7/1/1995 12:00:00 AM
Abstract :
We present simple methods to enhance the current minimal wormhole routing algorithms developed for high-radix, low-dimensional mesh networks so that they support fault-tolerant routing. We consider arbitrarily located faulty blocks and assume only local knowledge of faults. Messages are routed minimally when not blocked by faults, and this constraint is relaxed only to route around faults. The key concept we use is that a fault ring, consisting of fault-free nodes and links, can be formed around each fault region. Our fault-tolerant techniques use these fault rings to route messages around fault regions. We show that, using just one extra virtual channel per physical channel, the well-known e-cube algorithm can be used to provide deadlock-free routing in networks with nonoverlapping fault rings; there is no restriction on the number of faults. For the more complex faults with overlapping fault rings, four virtual channels are used. We also prove that at most four additional virtual channels are sufficient to make fully adaptive algorithms tolerant to multiple faulty blocks in n-dimensional meshes. All these algorithms are deadlock- and livelock-free. Further, we present simulation results for the e-cube algorithm and a fully adaptive algorithm fortified with our fault-tolerant routing techniques, and show that good performance may be obtained with as many as 10% of the links faulty.
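As a rough illustration of two terms used in the abstract, the sketch below shows fault-free e-cube (dimension-order) routing in a 2D mesh and the set of nodes forming a ring around a rectangular faulty block. It is not the authors' algorithm: it omits virtual channels and the rules for routing a blocked message around a ring. The names `ecube_path` and `fault_ring` are hypothetical, and the ring helper assumes the faulty block does not touch the mesh boundary.

```python
from typing import List, Tuple

Node = Tuple[int, int]  # (x, y) coordinates in a 2D mesh


def ecube_path(src: Node, dst: Node) -> List[Node]:
    """Fault-free dimension-order route: correct x first, then y."""
    x, y = src
    path = [src]
    while x != dst[0]:
        x += 1 if dst[0] > x else -1
        path.append((x, y))
    while y != dst[1]:
        y += 1 if dst[1] > y else -1
        path.append((x, y))
    return path


def fault_ring(x0: int, y0: int, x1: int, y1: int) -> List[Node]:
    """Nodes one hop outside the faulty block [x0..x1] x [y0..y1].

    Assumption (not from the source): the block is strictly interior,
    so every returned coordinate is a valid mesh node.
    """
    ring = []
    for x in range(x0 - 1, x1 + 2):
        for y in range(y0 - 1, y1 + 2):
            on_border = x in (x0 - 1, x1 + 1) or y in (y0 - 1, y1 + 1)
            if on_border:
                ring.append((x, y))
    return ring
```

For example, `ecube_path((0, 0), (2, 1))` travels along x to column 2 and then along y to row 1; a message whose e-cube path meets a faulty block would instead have to be misrouted along the nodes returned by `fault_ring`.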
Keywords :
adaptive systems; fault tolerant computing; message passing; multiprocessor interconnection networks; reliability; adaptive algorithms; arbitrarily located faulty blocks; deadlock free routing; e cube algorithm; fault free nodes; fault ring; fault-tolerant wormhole routing algorithms; fully adaptive algorithm; local knowledge; low dimensional mesh networks; minimal wormhole routing algorithms; multicomputer networks; multiple faulty blocks; nonoverlapping fault rings; performance evaluation; virtual channel; virtual channels; Adaptive algorithm; Communication switching; Computer network reliability; Computer networks; Concurrent computing; Fault tolerance; Mesh networks; Routing; System recovery; Telecommunication network reliability
Journal_Title :
Computers, IEEE Transactions on