Title :
Efficient Episode Mining with Minimal and Non-overlapping Occurrences
Author :
Zhu, Huisheng ; Wang, Peng ; He, Xianmang ; Li, Yujia ; Wang, Wei ; Shi, Baile
Author_Institution :
Fudan Univ., Shanghai, China
Abstract :
Frequent serial episodes within an event sequence describe the behavior of the users or systems of an application. Existing mining algorithms calculate the frequency of an episode from overlapping or non-minimal occurrences, which tends to over-count the support of long episodes or to characterize the followed-by-closely relationship among event types poorly. In addition, because they use an Apriori-style level-wise approach, these algorithms are computationally expensive. In this paper, we propose an efficient algorithm, MANEPI (Minimal And Non-overlapping EPIsode), for mining more interesting frequent episodes within a given event sequence. The proposed frequency measure is based on the minimal and non-overlapping occurrences of an episode and ensures better mining quality. A depth-first search strategy that uses the Apriori property for episode growth greatly improves the efficiency of mining long episodes, because it scans the given sequence only once and generates no candidate episodes. Moreover, an optimization technique is presented to narrow the search space and speed up the mining process. Experimental evaluation on both synthetic and real-world datasets demonstrates that our algorithms are more efficient and effective than existing algorithms.
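The frequency measure named in the abstract can be illustrated with a small sketch. This is not the MANEPI algorithm itself (it uses no prefix tree and no episode growth); it is a brute-force illustration under stated assumptions: each event's position in the list serves as its timestamp, a minimal occurrence is a window that contains the episode in order while no proper sub-window does, and two occurrences are non-overlapping when one ends before the other starts. The function names `minimal_occurrences` and `support` are hypothetical and do not come from the paper.

```python
def minimal_occurrences(events, episode):
    """Return the minimal occurrence windows (start, end) of a serial episode.

    Assumption: list positions act as timestamps, one event per position.
    """
    occs = []
    n, k = len(events), len(episode)
    if k == 0:
        return occs
    start = 0
    while start < n:
        # Forward pass: earliest end of an occurrence lying at or after `start`.
        j, end = 0, None
        for pos in range(start, n):
            if events[pos] == episode[j]:
                j += 1
                if j == k:
                    end = pos
                    break
        if end is None:
            break
        # Backward pass: latest start of an occurrence that ends at `end`.
        j = k - 1
        for pos in range(end, -1, -1):
            if events[pos] == episode[j]:
                j -= 1
                if j < 0:
                    s = pos
                    break
        occs.append((s, end))
        # Any further minimal occurrence must start strictly later.
        start = s + 1
    return occs


def support(events, episode):
    """Count a largest set of pairwise non-overlapping minimal occurrences.

    Minimal occurrences come out ordered by end position, so greedily
    keeping each window that starts after the last kept end is sufficient.
    """
    count, last_end = 0, -1
    for s, e in minimal_occurrences(events, episode):
        if s > last_end:
            count += 1
            last_end = e
    return count


seq = list("ABACBC")
print(minimal_occurrences(seq, list("ABC")))  # two windows that overlap
print(support(seq, list("ABC")))              # only one of them is counted
```

In the sequence A B A C B C, the episode A, B, C has two minimal occurrences whose windows share positions 2 and 3, so under these assumptions only one of them contributes to the support. A measure that counted overlapping occurrences would report a higher value.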
Keywords :
data mining; Apriori style level wise approach; MANEPI algorithm; Minimal And Non-overlapping EPIsode; efficient episode mining; event sequence; frequent serial episode; nonoverlapping occurrence; Frequent episode; Minimal and non-overlapping occurrences; Prefix tree
Conference_Title :
Data Mining (ICDM), 2010 IEEE 10th International Conference on
Conference_Location :
Sydney, NSW
Print_ISBN :
978-1-4244-9131-5
Electronic_ISBN :
1550-4786
DOI :
10.1109/ICDM.2010.25