Title :
Origin-based fault-tolerant routing in the mesh
Author :
Libeskind-Hadas, R. ; Brandt, Eli
Author_Institution :
Dept. of Comput. Sci., Harvey Mudd Coll., Claremont, CA, USA
Abstract :
The ability to tolerate faults is critical in multicomputers employing large numbers of processors. This paper describes a class of fault-tolerant routing algorithms for n-dimensional meshes that can tolerate large numbers of faults without using virtual channels. We show that these routing algorithms prevent livelock and deadlock while remaining highly adaptive.
Keywords :
concurrency control; distributed memory systems; fault tolerant computing; message passing; multiprocessor interconnection networks; reliability; deadlock; fault-tolerant routing algorithms; livelock; n-dimensional meshes; origin-based fault-tolerant routing; virtual channels; Computer science; Concurrent computing; Delay; Distributed computing; Educational institutions; Fault tolerance; Radio access networks; Routing; System recovery; Topology
Conference_Title :
High-Performance Computer Architecture, 1995. Proceedings., First IEEE Symposium on
Conference_Location :
Raleigh, NC
Print_ISBN :
0-8186-6445-2
DOI :
10.1109/HPCA.1995.386551