• DocumentCode
    2369121
  • Title

    Reliable detection of episodes in event sequences

  • Author

    Gwadera, Robert ; Atallah, Mikhail ; Szpankowski, Wojciech

  • Author_Institution
    Dept. of Comput. Sci., Purdue Univ., West Lafayette, IN, USA
  • fYear
    2003
  • fDate
    19-22 Nov. 2003
  • Firstpage
    67
  • Lastpage
    74
  • Abstract
    Suppose one wants to detect "bad" or "suspicious" subsequences in event sequences. Whether an observed pattern of activity (in the form of a particular subsequence) is significant and should be a cause for alarm depends on how likely it is to occur fortuitously. A long enough sequence of observed events will almost certainly contain any subsequence, so setting alarm thresholds is an important issue in a monitoring system that seeks to avoid false alarms. Suppose a long sequence T of observed events contains a suspicious subsequence pattern S, where S consists of m events and spans a window of size w within T. We address the fundamental problem: is a certain number of occurrences of a particular subsequence unlikely to be fortuitous (i.e., indicative of suspicious activity)? If the probability of fortuitous occurrences is high and an automated monitoring system flags them as suspicious anyway, then such a system will generate too many false alarms. We quantify the probability of such an S occurring in T within a window of size w, the number of distinct windows containing S as a subsequence, the expected number of such occurrences, and its variance, and we establish its limiting distribution, which allows an alarm threshold to be set so that the probability of false alarms is very small. We report on experiments confirming the theory and showing that we can detect bad subsequences with a low false alarm rate.
  • Keywords
    pattern matching; probability; reliability; security of data; alarm threshold; event sequence; monitoring system; subsequence pattern detection; Computerized monitoring; Contracts; Data mining; Event detection; Information security; Intrusion detection; National security; Sequences; Transaction databases;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Data Mining, 2003. ICDM 2003. Third IEEE International Conference on
  • Print_ISBN
    0-7695-1978-4
  • Type

    conf

  • DOI
    10.1109/ICDM.2003.1250904
  • Filename
    1250904
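
The quantity at the center of the abstract, the number of windows of size w in T that contain S as a subsequence, can be illustrated with a short sketch. This is a brute-force count written for clarity, not the authors' method; the function names `contains_subsequence` and `count_windows` are hypothetical, and events are assumed to be comparable with `==`.

```python
from typing import Sequence


def contains_subsequence(window: Sequence, pattern: Sequence) -> bool:
    """Return True if `pattern` occurs in `window` as a (possibly gapped) subsequence."""
    it = iter(window)
    # Each pattern event must be found after the previous match;
    # the shared iterator `it` enforces the left-to-right order.
    return all(any(event == p for event in it) for p in pattern)


def count_windows(events: Sequence, pattern: Sequence, w: int) -> int:
    """Count the windows of length `w` in `events` that contain `pattern`
    as a subsequence. Runs in O(len(events) * w) time (brute force)."""
    n = len(events)
    # range() is empty when w > n, so the count is 0 in that case.
    return sum(
        contains_subsequence(events[i:i + w], pattern)
        for i in range(n - w + 1)
    )


if __name__ == "__main__":
    T = "abcabc"   # observed event sequence (one character per event)
    S = "ac"       # suspicious pattern, m = 2
    print(count_windows(T, S, 3))
```

A monitoring system in the spirit of the abstract would compare this observed count against a threshold derived from the count's expected value, variance, and limiting distribution under a model of normal activity; those formulas are given in the paper and are not reproduced here.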