• DocumentCode
    3250448
  • Title

    Predicting rare events in temporal domains

  • Author

    Vilalta, Ricardo ; Ma, Sheng

  • Author_Institution
    Dept. of Comput. Sci., Houston Univ., TX, USA
  • fYear
    2002
  • fDate
    2002
  • Firstpage
    474
  • Lastpage
    481
  • Abstract
    Temporal data mining aims at finding patterns in historical data. Our work proposes an approach to extract temporal patterns from data to predict the occurrence of target events, such as computer attacks on host networks or fraudulent transactions in financial institutions. Our problem formulation exhibits two major challenges: 1) we assume events are characterized by categorical features and display uneven inter-arrival times, an assumption that falls outside the scope of classical time-series analysis; 2) we assume target events are highly infrequent, so predictive techniques must deal with the class-imbalance problem. We propose an efficient algorithm that tackles these challenges by transforming the event prediction problem into a search for all frequent eventsets preceding target events. The class-imbalance problem is overcome by searching for patterns on the minority class exclusively; the discrimination power of patterns is then validated against other classes. Patterns are then combined into a rule-based model for prediction. Our experimental analysis indicates the types of event sequences where target events can be accurately predicted.
  • Keywords
    data mining; set theory; categorical features; class-imbalance problem; computer attacks; discrimination power; financial institutions; fraudulent transactions; frequent eventsets; host networks; minority class; predictive techniques; rare events prediction; rule-based model; temporal data mining; temporal domains; temporal patterns extraction; uneven inter-arrival times; Computer crime; Computer displays; Computer science; Knowledge based systems; Network servers; Particle separators; Speech recognition; Target recognition; Testing; USA Councils
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Data Mining, 2002. ICDM 2002. Proceedings. 2002 IEEE International Conference on
  • Print_ISBN
    0-7695-1754-4
  • Type

    conf

  • DOI
    10.1109/ICDM.2002.1183991
  • Filename
    1183991