Author_Institution :
Sch. of Electr. Eng. Comput. Sci., Washington State Univ., Pullman, WA, USA
Abstract :
We consider the problem of computing a global commutative and associative operation (such as addition or multiplication), also known as a semigroup operation, on a faulty hypercube. In particular, we study the problem of performing such an operation in an n-dimensional SIMD hypercube, Qn, with up to n-1 node and/or link faults. In an SIMD hypercube, during a communication step, nodes can exchange information with their neighbors only across a specific dimension. Given a set of at most n-1 faults, we develop an ordering d1, d2, ..., dn of the n dimensions that depends on where the faults are located. An important and useful property of this dimension ordering is the following: if the n-cube is partitioned into k-subcubes using the first k dimensions of this ordering, namely d1, d2, ..., dk, for any 2 ≤ k ≤ n, then each k-subcube in the partition contains at most k-1 faults. We use this result to develop algorithms for global sum. These algorithms use 3n-2, n+3 log n+3 log log n, and n+log n+d2 log log n+O(log log log n) time steps, respectively.
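The partition property stated in the abstract can be checked by brute force on small cubes. The following Python sketch is illustrative only and is not the authors' construction. It assumes that faults are node faults only (link faults are ignored), that nodes are labeled by n-bit integers, and that a k-subcube of the partition consists of the nodes that agree on the last n-k dimensions of the ordering. The function names `has_partition_property`, `find_ordering`, and `global_sum_fault_free` are hypothetical. The last function shows only the standard fault-free dimension-exchange sum (n steps), included as a baseline for the step counts quoted above.

```python
from itertools import permutations


def has_partition_property(n, order, faults):
    """Return True if, for every 2 <= k <= n, each k-subcube spanned by the
    first k dimensions of `order` contains at most k-1 faulty nodes.
    Assumption: `faults` is a set of faulty node labels (node faults only)."""
    for k in range(2, n + 1):
        fixed_dims = order[k:]  # nodes in one subcube agree on these bits
        counts = {}
        for f in faults:
            key = tuple((f >> d) & 1 for d in fixed_dims)
            counts[key] = counts.get(key, 0) + 1
        if any(c > k - 1 for c in counts.values()):
            return False
    return True


def find_ordering(n, faults):
    """Exhaustive search over all n! orderings (exponential; the paper
    derives the ordering from the fault locations instead)."""
    for order in permutations(range(n)):
        if has_partition_property(n, order, faults):
            return order
    return None


def global_sum_fault_free(values, order):
    """Fault-free baseline: in each step every node adds the value held by
    its neighbor across one dimension; after n steps all nodes hold the sum."""
    vals = list(values)
    for d in order:
        vals = [vals[i] + vals[i ^ (1 << d)] for i in range(len(vals))]
    return vals


if __name__ == "__main__":
    faults = {0b000, 0b001}  # two faulty nodes in Q3, adjacent across dimension 0
    print(has_partition_property(3, (0, 1, 2), faults))
    print(find_ordering(3, faults))
    print(global_sum_fault_free(range(8), (2, 1, 0)))
```

In this example the two faults differ only in dimension 0, so any ordering that places dimension 0 among its first two dimensions puts both faults in the same 2-subcube and violates the bound for k = 2.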
Keywords :
fault tolerant computing; hypercube networks; associative reduction operations; commutative operation; faulty SIMD hypercubes; faulty hypercube; Application software; Commutation; Degradation; Fault tolerance; High performance computing; Hypercubes; Parallel processing; Partitioning algorithms; Robustness; Space exploration;