Title :
ART: robustness of meshes and tori for parallel and distributed computation
Author :
Yeh, Chi-Hsiang ; Parhami, Behrooz
Author_Institution :
Dept. of Electr. & Comput. Eng., Queen's Univ., Canada
Abstract :
We formulate array robustness theorems (ARTs) for efficient computation and communication on faulty arrays. No hardware redundancy is required and no assumption is made about the availability of a complete submesh or subtorus. Based on ARTs, a very wide variety of problems, including sorting, FFT, total exchange, permutation, and some matrix operations, can be solved with a slowdown factor of 1+o(1). The number of faults tolerated by ARTs ranges from o(min(n^(1-1/d), n/d, n/h)) for n-ary d-cubes with worst-case faults to as large as o(N) for most N-node 2-D meshes or tori with random faults, where h is the number of data items per processor. The resultant running times are the best results reported thus far for solving many problems on faulty arrays. Based on ARTs and several other components such as robust libraries, the priority emulation discipline, and X'Y' routing, we introduce the robust adaptation interface layer (RAIL) as a middleware between ordinary algorithms/programs and the faulty network/hardware. In effect, RAIL provides a virtual fault-free network to higher layers, while ordinary algorithms/programs are transformed through RAIL into corresponding robust algorithms/programs that can run on faulty networks.
Keywords :
fault tolerant computing; matrix multiplication; multiprocessor interconnection networks; parallel processing; sorting; telecommunication network routing; FFT; RAIL; array robustness theorems; distributed computation; fault-free arrays; faulty arrays; matrix operations; meshes; middleware; parallel computation; permutation; random faults; robust adaptation interface layer; slowdown factor; sorting; tori; total exchange; Art; Concurrent computing; Distributed computing; Hardware; Libraries; Rails; Redundancy; Robustness; Sorting; Subspace constraints;
Conference_Title :
Parallel Processing, 2002. Proceedings. International Conference on
Print_ISBN :
0-7695-1677-7
DOI :
10.1109/ICPP.2002.1040903