Title :
Fault-tolerant wormhole switching with backtracking capability
Author :
Kitakami, Masato ; Sueishi, Manabu
Author_Institution :
Fac. of Eng., Chiba Univ., Japan
Abstract :
The message switching method, which controls message transmission in a parallel computer, is one of the most important factors in the computer's performance. Because parallel computers have high failure rates, many fault-tolerant switching methods have been proposed. The existing methods have problems, however, such as low communication throughput, low fault-tolerance capability, large hardware overhead, or the need for a special routing method. This paper proposes a fault-tolerant wormhole switching method that gains backtracking capability by inserting dummy flits after the header flit. It can be used with general fault-tolerant routing methods that require backtracking capability from the switching method. Computer simulation shows that in a 16 by 16 2D torus, for example, the throughput of the proposed method is almost equal to that of existing methods that require large hardware overhead, provided that the number of faulty nodes is less than or equal to 64.
Keywords :
data communication; fault tolerant computing; message switching; multiprocessor interconnection networks; parallel architectures; 2D torus system; backtracking capability; dummy flit insertion; fault-tolerant routing; fault-tolerant wormhole switching; header flits; message switching method; message transmission; parallel computers; Circuits; Communication switching; Computer aided instruction; Concurrent computing; Delay; Fault tolerance; Fault tolerant systems; Hardware; Physical layer; Routing;
Conference_Title :
Defect and Fault Tolerance in VLSI Systems, 2005. DFT 2005. 20th IEEE International Symposium on
Print_ISBN :
0-7695-2464-8
DOI :
10.1109/DFTVS.2005.35