DocumentCode :
2849691
Title :
Detection of significant sets of episodes in event sequences
Author :
Atallah, Mikhail ; Szpankowski, Wojciech ; Gwadera, Robert
Author_Institution :
Dept. of Comput. Sci., Purdue Univ., West Lafayette, IN, USA
fYear :
2004
fDate :
1-4 Nov. 2004
Firstpage :
3
Lastpage :
10
Abstract :
We present a method for the reliable detection of "unusual" sets of episodes in the form of many pattern sequences, scanned simultaneously for an occurrence as a subsequence in a large event stream within a window of size w. We also investigate the important special case of all permutations of the same sequence, which models the situation where the order of events in an episode does not matter, e.g., when events correspond to purchased market basket items. In order to build a reliable monitoring system, we compare obtained measurements to a reference model, which in our case is a probabilistic model (Bernoulli or Markov). We first present a precise analysis that leads to the construction of a threshold. The difficulties of carrying out a probabilistic analysis for an arbitrary set of patterns stem from the possible simultaneous occurrence of many members of the set as subsequences in the same window, the fact that the different patterns typically have common symbols, common subsequences, or possibly common prefixes, and the fact that they may have different lengths. We also report on extensive experimental results, carried out on the Wal-Mart transactions database, that show a remarkable agreement with our theoretical analysis. This paper is an extension of our previous work, where we laid out the foundation for the problem of the reliable detection of "unusual" episodes but did not consider more than one episode scanned simultaneously for an occurrence.
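To make the counting step in the abstract concrete, here is a minimal sketch of scanning an event stream for a set of patterns, each tested as a (not necessarily contiguous) subsequence inside a sliding window of size w. It is a naive illustration, not the authors' algorithm: the function names (`is_subsequence`, `all_permutations`, `count_matching_windows`) are hypothetical, and the comparison of the observed count against the threshold derived from the Bernoulli or Markov reference model is left out because the abstract does not give that formula.

```python
from itertools import permutations


def is_subsequence(pattern, window):
    """Return True if `pattern` occurs in `window` as a subsequence
    (symbols in order, gaps allowed)."""
    it = iter(window)
    # `symbol in it` advances the iterator until it finds `symbol`,
    # so successive symbols must appear in order.
    return all(symbol in it for symbol in pattern)


def all_permutations(pattern):
    """Hypothetical helper: the distinct orderings of one pattern, modelling
    the special case where the order of events does not matter."""
    return sorted(set(permutations(pattern)))


def count_matching_windows(events, patterns, w):
    """Count sliding windows of size `w` that contain at least one
    member of `patterns` as a subsequence."""
    count = 0
    for start in range(len(events) - w + 1):
        window = events[start:start + w]
        if any(is_subsequence(p, window) for p in patterns):
            count += 1
    return count


if __name__ == "__main__":
    stream = "abcab"
    print(count_matching_windows(stream, ["abc"], 3))
    print(count_matching_windows(stream, all_permutations("abc"), 3))
```

In a monitoring setting, the resulting count would be compared with a threshold computed from the reference model; a count above that threshold would mark the set of episodes as "unusual".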
Keywords :
pattern recognition; probability; sequences; set theory; Wal-Mart transactions database; event sequences; large event stream; market basket items; pattern sequences; probabilistic analysis; reference model; reliable monitoring system; significant episode sets; Contracts; Event detection; Information security; Intrusion detection; Monitoring; National security; Pattern analysis; Probability; Sequences; Transaction databases;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Data Mining, 2004. ICDM '04. Fourth IEEE International Conference on
Print_ISBN :
0-7695-2142-8
Type :
conf
DOI :
10.1109/ICDM.2004.10090
Filename :
1410260