Title :
ACID: Adaptive, convergent, and intelligent fault monitoring for distributed systems
Author :
Hussain, Shujaat ; Qadir, Muhammad Abdul
Author_Institution :
Center for Distrib. & Semantic Comput., Mohammad Ali Jinnah Univ., Islamabad
Abstract :
Fault monitoring is an important issue for fault-tolerant distributed systems. An efficient fault monitoring scheme makes it easy to detect a crash and quickly take recovery steps. A fault monitor typically detects faults by sending messages to and receiving messages from remote objects. One of the monitor's major responsibilities is to adapt timeouts to the dynamic network and system conditions and to set them very close to the real delays in the system. The timeouts must not fluctuate with large amplitudes around the actual time delays, and they should not adapt to sudden transient behavior; otherwise the number of false alarms would increase, which may trigger heavy fault recovery mechanisms. The relationship between timeouts and monitoring intervals also needs to be managed intelligently. Our technique adapts the timeout based on previous history, which gives us a fair idea of the workload, and we use this to our advantage. When we tested the existing schemes against the three points just mentioned, to our surprise, none of the schemes complied with these points. We experimented with our technique alongside some other proposed techniques, and our scheme, ACID, gave very good results in comparison.
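The abstract does not specify how ACID computes its timeout, so the following is only a minimal sketch of the general idea it describes: a timeout derived from recent delay history that stays close to real delays and does not react to a single transient spike. The class name `HistoryTimeout`, the sliding-window median, and the parameters `window`, `margin`, and `initial_timeout` are all assumptions made for illustration and are not taken from the paper.

```python
from collections import deque
from statistics import median


class HistoryTimeout:
    """Illustrative history-based timeout estimator (hypothetical, not the ACID algorithm)."""

    def __init__(self, window=5, margin=1.5, initial_timeout=1.0):
        # Assumed parameters: window size, safety factor, and a default timeout.
        self.delays = deque(maxlen=window)  # most recent observed delays, in seconds
        self.margin = margin                # safety factor above the typical delay
        self.initial_timeout = initial_timeout

    def record(self, delay):
        """Store the measured delay of one answered probe."""
        self.delays.append(delay)

    def timeout(self):
        """Return margin times the median of the recent delays."""
        if not self.delays:
            return self.initial_timeout
        return self.margin * median(self.delays)


monitor = HistoryTimeout()
for d in [0.10, 0.11, 0.09, 0.10]:
    monitor.record(d)
steady = monitor.timeout()       # timeout under steady conditions

monitor.record(2.0)              # a single transient spike
after_spike = monitor.timeout()  # the median keeps the timeout unchanged here
```

A median is used in this sketch because one outlier in the window cannot move it, whereas a plain mean would jump with the spike; a sustained change in delay would still shift the timeout once it fills more than half the window.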
Keywords :
delays; fault diagnosis; fault tolerant computing; system recovery; failure detection; false fault recovery mechanisms; fault monitoring; fault tolerant distributed system; time delays; timeouts; Computer crashes; Condition monitoring; Delay effects; Delay systems; Fault detection; Fault tolerant systems; History; Object detection; Remote monitoring; Testing; Fault Tolerance; monitoring interval; timeout
Conference_Title :
Emerging Technologies, 2008. ICET 2008. 4th International Conference on
Conference_Location :
Rawalpindi
Print_ISBN :
978-1-4244-2210-4
Electronic_ISBN :
978-1-4244-2211-1
DOI :
10.1109/ICET.2008.4777487