Title :
Failure prediction in IBM BlueGene/L event logs
Author :
Zhang, Yanyong ; Sivasubramaniam, Anand
Author_Institution :
ECE Dept., Rutgers Univ., Piscataway, NJ
Abstract :
In this paper, we present our effort to develop a failure prediction model based on event logs collected from IBM BlueGene/L. We first show how the event records can be converted into a data set suitable for classification techniques. We then apply several classifiers to the data: RIPPER (a rule-based classifier), support vector machines (SVMs), a traditional nearest neighbor method, and a customized nearest neighbor method. We show that the customized nearest neighbor approach can outperform RIPPER and SVMs in terms of both coverage and precision. The results suggest that the customized nearest neighbor approach can be used to alleviate the impact of failures.
Keywords :
failure analysis; parallel machines; pattern classification; support vector machines; IBM BlueGene/L event logs; RIPPER; classification techniques; failure prediction; nearest neighbor method; rule-based classifier; Accuracy; Data mining; Fault tolerant systems; Large-scale systems; Machine learning; Nearest neighbor searches; Predictive models; Runtime; Support vector machine classification
Conference_Title :
Parallel and Distributed Processing, 2008. IPDPS 2008. IEEE International Symposium on
Conference_Location :
Miami, FL
Print_ISBN :
978-1-4244-1693-6
ISSN :
1530-2075
DOI :
10.1109/IPDPS.2008.4536397