DocumentCode :
1148823
Title :
Abstractions for Node Level Passive Fault Detection in Distributed Systems
Author :
Oikonomou, Kostas N. ; Kain, Richard Y.
Author_Institution :
Bell Laboratories
Issue :
6
fYear :
1983
fDate :
6/1/1983 12:00:00 AM
Firstpage :
543
Lastpage :
550
Abstract :
We introduce a scheme for passive node-level fault detection in a distributed system. With each system node we associate a low-cost, low-complexity observer which monitors the pattern of incoming and outgoing messages and compares it against an abstracted model of the node's behavior. We develop a fault detection procedure, which is probabilistic because of nondeterminism in the simplified node model. Abstraction reduces model complexity, but renders some errors undetectable by the observer. In the paper we characterize these undetectable errors. Succeeding studies show how to select model abstractions to lower the number of undetectable errors.
Keywords :
Concurrent fault detection; distributed systems; fault detection; Computer errors; Condition monitoring; Costs; Fault detection; Frequency estimation; Missiles; Observers; Probability; Signal design; System testing
fLanguage :
English
Journal_Title :
Computers, IEEE Transactions on
Publisher :
IEEE
ISSN :
0018-9340
Type :
jour
DOI :
10.1109/TC.1983.1676276
Filename :
1676276