Title :
Fault tolerance in hyperbus and hypercube multiprocessors using partitioning scheme
Author :
Wang, Shih-Chang ; Kuo, Sy-Yen
Author_Institution :
Dept. of Electr. Eng., Nat. Taiwan Univ., Taipei, Taiwan
Abstract :
In this paper, the partitioning scheme is used to achieve fault tolerance in hyperbus and hypercube multiprocessors. Unlike other schemes, processor faults are assumed to be randomly distributed. We propose a novel and practical load redistribution method to tolerate processor faults in a hyperbus structure with insignificant overhead (in the worst case, a slowdown factor of 2 for computation and 3 for communication). Standard routing and broadcasting algorithms were implemented on hypercube computers. To achieve fault tolerance, we present routing and broadcasting algorithms for a faulty hypercube with at most n-1 faults. Compared with other existing algorithms, our methods have better performance in most measures.
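For context on the "standard routing" the abstract refers to, the following is a minimal sketch of dimension-order (e-cube) routing in a fault-free n-dimensional hypercube. It is a textbook baseline and not the fault-tolerant algorithm proposed in the paper, whose details the abstract does not give. The function name `ecube_route` and its parameters are hypothetical, chosen only for this illustration.

```python
def ecube_route(src: int, dst: int, n: int) -> list[int]:
    """Return the node sequence from src to dst in a fault-free n-cube.

    Nodes are labelled 0 .. 2**n - 1; two nodes are neighbours when their
    labels differ in exactly one bit. This is plain dimension-order routing
    (an assumed baseline), with no handling of faulty nodes.
    """
    size = 1 << n
    if not (0 <= src < size and 0 <= dst < size):
        raise ValueError("node labels must lie in [0, 2**n)")

    path = [src]
    node = src
    # Correct the differing bits one dimension at a time, lowest first.
    for dim in range(n):
        if (node ^ dst) & (1 << dim):
            node ^= 1 << dim  # hop to the neighbour across dimension `dim`
            path.append(node)
    return path


if __name__ == "__main__":
    # Route from node 000 to node 101 in a 3-cube.
    print(ecube_route(0b000, 0b101, 3))
```

The path length always equals the Hamming distance between the two labels. The bound of n-1 faults in the abstract matches the fact that an n-cube has node connectivity n, so removing up to n-1 nodes leaves the remaining nodes connected.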
Keywords :
fault tolerant computing; hypercube networks; multiprocessing systems; broadcasting algorithms; fault tolerance; hyperbus; hypercube multiprocessors; load redistribution method; partitioning scheme; Broadcasting; Computer networks; Concurrent computing; Councils; Fault tolerance; Hypercubes; Joining processes; Partitioning algorithms; Routing; Topology
Conference_Title :
Parallel and Distributed Systems, 1994. International Conference on
Conference_Location :
Hsinchu
Print_ISBN :
0-8186-6555-6
DOI :
10.1109/ICPADS.1994.590319