Title :
Bayes analysis for fault location in distributed systems
Author :
Chang, Yu Lo Cyrus ; Lander, Leslie C. ; Lu, Horng-Shing ; Wells, Martin T.
Author_Institution :
Dept. of Comput. Sci. & Electr. Eng., Tennessee Univ., Chattanooga, TN, USA
fDate :
9/1/1994
Abstract :
The authors propose a simple and practical probabilistic model for fault location in distributed systems, built on multiple incomplete tests and a Bayes analysis procedure. Since it is easier to compare test results among processing units, their model is comparison-based. The approach is realistic and complete in the sense that it does not assume conditions such as permanently faulty units, complete tests, or perfect or nonmalicious environments. It handles fault-free systems without any overhead, so the test procedure can also be used to monitor a functioning system. Given a system S with a specific test graph, the conditional distribution relating the comparison test results (the syndrome) to the fault patterns of S can be generated. To avoid the complex global Bayes estimation process, the authors develop a simple bitwise Bayes algorithm for fault location in S, which locates system failures with linear complexity, making it suitable for hard real-time systems. Hence, their approach is appealing from both the practical and the theoretical points of view.
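The abstract does not give the algorithm itself, so the following is only a minimal sketch of what a per-unit ("bitwise") Bayes decision over a comparison syndrome could look like. It is not the authors' method. Everything in it is an assumption made for illustration: the function names, the independent prior `p_fault`, and the mismatch probabilities `a` (both units fault-free), `b` (exactly one faulty) and `c` (both faulty). Each unit's posterior uses only the comparisons it takes part in, with the other unit in each comparison marginalised under the prior, so the work is one pass over the edges of the test graph.

```python
from typing import Dict, List, Tuple

Syndrome = Dict[Tuple[int, int], int]  # (unit_i, unit_j) -> 1 if results disagree, else 0


def bitwise_posteriors(n_units: int, syndrome: Syndrome, p_fault: float = 0.1,
                       a: float = 0.05, b: float = 0.9, c: float = 0.5) -> List[float]:
    """Posterior fault probability of each unit from its own comparisons only.

    Assumed (hypothetical) model: units fail independently with prior p_fault;
    a comparison disagrees with probability a, b or c when zero, one or two
    of the compared units are faulty.
    """
    # Mismatch probability on one edge given this unit's state,
    # with the neighbour's unknown state averaged out under the prior.
    mismatch_if_ok = (1 - p_fault) * a + p_fault * b
    mismatch_if_faulty = (1 - p_fault) * b + p_fault * c

    # Start from the prior and multiply in one likelihood factor per incident edge.
    weight_ok = [1 - p_fault] * n_units
    weight_faulty = [p_fault] * n_units
    for (i, j), disagree in syndrome.items():
        for u in (i, j):
            if disagree:
                weight_ok[u] *= mismatch_if_ok
                weight_faulty[u] *= mismatch_if_faulty
            else:
                weight_ok[u] *= 1 - mismatch_if_ok
                weight_faulty[u] *= 1 - mismatch_if_faulty

    return [f / (f + o) for f, o in zip(weight_faulty, weight_ok)]


def locate_faults(n_units: int, syndrome: Syndrome, **model: float) -> List[int]:
    """Flag each unit whose posterior fault probability exceeds one half."""
    posteriors = bitwise_posteriors(n_units, syndrome, **model)
    return [u for u, post in enumerate(posteriors) if post > 0.5]


# Example: four units compared around a ring; both comparisons touching unit 2 disagree.
ring_syndrome = {(0, 1): 0, (1, 2): 1, (2, 3): 1, (3, 0): 0}
suspects = locate_faults(4, ring_syndrome)
```

With the default parameters assumed here, the ring example flags only unit 2, and a syndrome with no disagreements flags no unit, which mirrors the abstract's point that a fault-free system needs no special handling.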
Keywords :
Bayes methods; failure analysis; fault location; probability; reliability theory; Bayes analysis; bitwise Bayes algorithm; comparison-based model; distributed systems; probabilistic model; real-time; reliability; Associate members; Fault diagnosis; Inference algorithms; Multiprocessing systems; Real time systems; Statistical analysis; Statistical distributions; System testing; Test pattern generators
Journal_Title :
Reliability, IEEE Transactions on