DocumentCode :
2393542
Title :
The scheme design of distributed systems service fault management based on active probing
Author :
Deng, Li ; Qu, Xiaoyan ; Ma, Dengwu
Author_Institution :
Dept. of Armament Sci. & Technol., Naval Aeronaut. & Astronaut. Univ., Yantai, China
fYear :
2012
fDate :
19-20 May 2012
Firstpage :
1644
Lastpage :
1649
Abstract :
Service fault management in distributed computer systems and networks is a difficult task that requires highly efficient inference from large volumes of data. In this paper, we propose a solution to this problem. First, the challenges of service fault management in distributed systems are analyzed, and a multilayer model is recommended. Then, a dependency matrix representing the causal relationship between faults and probes is defined, and the fault management framework is built. Next, a service fault management scheme using active probing is proposed. The scheme consists of two phases: fault detection and fault localization. In the first phase, we propose a probe selection algorithm that selects a minimal set of probes while maintaining a high probability of fault detection. In the second phase, we propose a fault localization probe selection algorithm that selects probes to obtain more system information based on the symptoms observed in the previous phase. Finally, an example instance demonstrates the validity and efficiency of our scheme.
Keywords :
distributed processing; inference mechanisms; software fault tolerance; active probing; dependency matrix; distributed systems service fault management; fault localization; inferences; multilayer model; probe selection algorithm; Algorithm design and analysis; Fault detection; Monitoring; Nonhomogeneous media; Probes; Quality of service; Software; active probing; distributed systems; fault management; service management;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Systems and Informatics (ICSAI), 2012 International Conference on
Conference_Location :
Yantai
Print_ISBN :
978-1-4673-0198-5
Type :
conf
DOI :
10.1109/ICSAI.2012.6223356
Filename :
6223356