DocumentCode :
244324
Title :
Mining Historical Issue Repositories to Heal Large-Scale Online Service Systems
Author :
Rui Ding ; Qiang Fu ; Jian Guang Lou ; Qingwei Lin ; Dongmei Zhang ; Tao Xie
fYear :
2014
fDate :
23-26 June 2014
Firstpage :
311
Lastpage :
322
Abstract :
Online service systems have become increasingly popular and important. Reducing the MTTR (Mean Time to Restore) of a service remains one of the most important steps to assure the user-perceived availability of the service. To reduce the MTTR, a common practice is to restore the service by identifying and applying an appropriate healing action. In this paper, we present an automated mining-based approach for suggesting an appropriate healing action for a given new issue. Our approach suggests an appropriate healing action by adapting healing actions from the retrieved similar historical issues. We have applied our approach to a real-world and large-scale product online service. The studies on 243 real issues of the service show that our approach can effectively suggest appropriate healing actions (with 87% accuracy) to reduce the MTTR of the service. In addition, according to issue characteristics, we further study and categorize issues where automatic healing suggestion faces difficulties.
Keywords :
Web services; data mining; fault tolerant computing; MTTR; automated mining-based approach; automatic healing; healing action; historical issue repositories; large-scale product online service systems; mean time to restore; real-world product online service; service restoration; user-perceived availability; Availability; Correlation; Lattices; Manuals; Measurement; Mutual information; Vectors; Online service system; healing action; incident management; issue repository;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Dependable Systems and Networks (DSN), 2014 44th Annual IEEE/IFIP International Conference on
Conference_Location :
Atlanta, GA
Type :
conf
DOI :
10.1109/DSN.2014.39
Filename :
6903589