DocumentCode
3250448
Title
Predicting rare events in temporal domains
Author
Vilalta, Ricardo ; Ma, Sheng
Author_Institution
Dept. of Comput. Sci., Houston Univ., TX, USA
fYear
2002
fDate
2002
Firstpage
474
Lastpage
481
Abstract
Temporal data mining aims to find patterns in historical data. Our work proposes an approach that extracts temporal patterns from data to predict the occurrence of target events, such as computer attacks on host networks or fraudulent transactions in financial institutions. Our problem formulation poses two major challenges: 1) we assume events are characterized by categorical features and display uneven inter-arrival times, an assumption that falls outside the scope of classical time-series analysis; 2) we assume target events are highly infrequent, so predictive techniques must deal with the class-imbalance problem. We propose an efficient algorithm that tackles both challenges by transforming the event prediction problem into a search for all frequent eventsets preceding target events. The class-imbalance problem is overcome by searching for patterns exclusively on the minority class; the discrimination power of those patterns is then validated against the other classes. The validated patterns are combined into a rule-based model for prediction. Our experimental analysis indicates the types of event sequences in which target events can be accurately predicted.
Keywords
data mining; set theory; categorical features; class-imbalance problem; computer attacks; discrimination power; financial institutions; fraudulent transactions; frequent eventsets; host networks; minority class; predictive techniques; rare events prediction; rule-based model; temporal data mining; temporal domains; temporal patterns extraction; uneven inter-arrival times; Computer crime; Computer displays; Computer science; Knowledge based systems; Network servers; Particle separators; Speech recognition; Target recognition; Testing; USA Councils
fLanguage
English
Publisher
IEEE
Conference_Titel
Data Mining, 2002. ICDM 2002. Proceedings. 2002 IEEE International Conference on
Print_ISBN
0-7695-1754-4
Type
conf
DOI
10.1109/ICDM.2002.1183991
Filename
1183991