Title :
On fault-tolerant structure, distributed fault-diagnosis, reconfiguration, and recovery of the array processors
Author :
Hosseini, Seyed H.
Author_Institution :
Dept. of Electr. Eng. & Comput. Sci., Wisconsin Univ., Milwaukee, WI, USA
fDate :
7/1/1989
Abstract :
A study is made of the design of fault-tolerant array processors. It is shown how hardware redundancy can be used in the existing structures in order to make them capable of withstanding the failure of some of the array links and processors. Distributed fault-tolerance schemes are introduced for the diagnosis of the faulty elements, reconfiguration, and recovery of the array. Fault tolerance is maintained by the cooperation of processors in a decentralized form of control without the participation of any type of hardcore or fault-free central controller such as a host computer. Time redundancy is utilized by assigning the functions of the failed processors to fault-free processors.
Keywords :
distributed processing; fault tolerant computing; parallel processing; array processors; decentralized form; distributed fault-diagnosis; fault-tolerant structure; faulty elements; hardware redundancy; reconfiguration; recovery; Application software; Centralized control; Computer errors; Distributed computing; Fabrication; Fault diagnosis; Fault tolerance; Redundancy; Switches
Journal_Title :
Computers, IEEE Transactions on