DocumentCode :
1275391
Title :
Fault-tolerant communication with partitioned dimension-order routers
Author :
Boppana, Rajendra V. ; Chalasani, Suresh
Author_Institution :
Div. of Comput. Sci., Texas Univ., San Antonio, TX, USA
Volume :
10
Issue :
10
fYear :
1999
fDate :
10/1/1999
Firstpage :
1026
Lastpage :
1039
Abstract :
Current fault-tolerant routing methods require extensive changes to practical routers, such as the Cray T3D's dimension-order router, to handle faults. In this paper, we propose methods to handle faults in multicomputers with dimension-order routers using only simple changes to router structure and logic. Our techniques can be applied to current implementations in which the router is partitioned into multiple modules and no centralized crossbar is used. We consider arbitrarily located faulty blocks and assume only local knowledge of faults. We apply our techniques to torus networks and show that, with as few as four virtual channels per physical channel, deadlock- and livelock-free routing can be provided even with multiple faults and multimodule implementation of routers. Our simulations of the proposed technique for 2D tori and meshes indicate that the performance degradation is similar to that seen in the case of previously proposed crossbar-based designs.
Keywords :
fault tolerant computing; multiprocessor interconnection networks; performance evaluation; telecommunication network routing; 2D tori; Cray T3D; arbitrarily located faulty blocks; deadlock-free routing; dimension-order router; dimension-order routers; fault-tolerant communication; livelock-free routing; multicomputers; partitioned dimension-order routers; router structure; simulations; torus networks; virtual channels; Communication switching; Degradation; Fault tolerance; Logic; Network topology; Pins; Rivers; Routing; System recovery; Throughput
fLanguage :
English
Journal_Title :
Parallel and Distributed Systems, IEEE Transactions on
Publisher :
IEEE
ISSN :
1045-9219
Type :
jour
DOI :
10.1109/71.808144
Filename :
808144