DocumentCode :
2077140
Title :
FDAE: A f̲ailure d̲etector for a̲synchronous e̲vents
Author :
Farruggia, Alfonso ; Ortolani, Marco ; Lo Re, Giuseppe
Author_Institution :
Dept. of Comput. Eng., Univ. of Palermo, Palermo, Italy
fYear :
2010
fDate :
16-18 Aug. 2010
Firstpage :
197
Lastpage :
202
Abstract :
Detecting element failures is an important issue in distributed systems. A fault-tolerant system needs to detect a failure and recover from it promptly. Traditional approaches to fault tolerance, however, are usually not completely free from errors during the failure detection phase; a good failure detector is thus a very important component for minimizing these errors. In this paper we present a failure detector able to monitor both asynchronous and synchronous elements of a distributed system by exchanging messages with the monitored elements. To assess the health status of monitored elements, our failure detector relies on a simple query/ACK mechanism, which requires a reliable timeout estimate in order to properly set the monitoring interval. To this end, our failure detector uses the history of past estimates to compute new values for both the timeout and the monitoring interval. The proposed model introduces a new label to tag monitored elements, in addition to those used in traditional failure detectors. To evaluate this work, we compared it with two other algorithms by computing performance metrics, such as specificity and sensitivity, and by considering the number of required control packets. We also compared the performance of the failure detectors by computing their detection time.
Keywords :
distributed processing; fault tolerant computing; system monitoring; ACK mechanism; asynchronous element monitoring; asynchronous events; distributed system; element failure detection; failure detector; fault tolerant system; health status assessment; query mechanism; synchronous element monitoring; Monitoring;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Networked Computing and Advanced Information Management (NCM), 2010 Sixth International Conference on
Conference_Location :
Seoul
Print_ISBN :
978-1-4244-7671-8
Electronic_ISBN :
978-89-88678-26-8
Type :
conf
Filename :
5572274