DocumentCode :
1150240
Title :
A Diagnosis Algorithm for Distributed Computing Systems with Dynamic Failure and Repair
Author :
Hosseini, S.H. ; Kuhl, Jon G. ; Reddy, Sudhakar M.
Author_Institution :
Department of Electrical Engineering and Computer Science, University of Wisconsin
Issue :
3
fYear :
1984
fDate :
3/1/1984
Firstpage :
223
Lastpage :
233
Abstract :
The problem of designing distributed fault-tolerant computing systems is considered. A model in which the network nodes are assumed to possess the ability to "test" certain other network facilities for the presence of failures is employed. Using this model, a distributed algorithm is presented which allows all the network nodes to correctly reach independent diagnoses of the condition (faulty or fault-free) of all the network nodes and internode communication facilities, provided the total number of failures does not exceed a given bound. The proposed algorithm allows for the reentry of repaired or replaced faulty facilities back into the network, and it also has provisions for adding new nodes to the system. Sufficient conditions are obtained for designing a distributed fault-tolerant system by employing the given algorithm. The algorithm has the interesting property that it lets as many as all of the nodes and internode communication facilities fail, but upon repair or replacement of faulty facilities, the system can converge to normal operation if no more than a certain number of facilities remain faulty.
Keywords :
Computer networks; distributed systems; fault-tolerance; self-diagnosable systems; testing; Algorithm design and analysis; Application software; Automatic testing; Computer network reliability; Distributed algorithms; Distributed computing; Fault tolerant systems; Hardware; Sufficient conditions; System testing
fLanguage :
English
Journal_Title :
Computers, IEEE Transactions on
Publisher :
IEEE
ISSN :
0018-9340
Type :
jour
DOI :
10.1109/TC.1984.1676419
Filename :
1676419