Title :
Alert Detection in System Logs
Author :
Oliner, Adam J. ; Aiken, Alex ; Stearley, Jon
Author_Institution :
Stanford Univ., Stanford, CA
Abstract :
We present Nodeinfo, an unsupervised algorithm for anomaly detection in system logs. We demonstrate Nodeinfo's effectiveness on data from four of the world's most powerful supercomputers: using logs representing over 746 million processor-hours, in which anomalous events called alerts were manually tagged for scoring, we aim to automatically identify the regions of the log containing those alerts. We formalize the alert detection task in these terms, describe how Nodeinfo uses the information entropy of message terms to identify alerts, and present an online version of this algorithm, which is now in production use. This is the first work to investigate alert detection on (several) publicly-available supercomputer system logs, thereby providing a reproducible performance baseline.
Keywords :
entropy; security of data; Nodeinfo; alert detection; anomaly detection; information entropy; message terms; system logs; unsupervised algorithm; Costs; Data mining; Detection algorithms; Fault detection; Laboratories; Personnel; Production systems; Supercomputers; USA Councils; hpc; information theory; log analysis
Conference_Title :
Data Mining, 2008. ICDM '08. Eighth IEEE International Conference on
Conference_Location :
Pisa
Print_ISBN :
978-0-7695-3502-9
DOI :
10.1109/ICDM.2008.132