Title :
Mining compressed frequent itemsets over data stream in sliding windows
Author :
Zhao, Li ; Tong, Yongxin ; Yu, Dan ; Ma, Shilong ; Chen, Mengdong
Author_Institution :
State Key Lab. of Software Dev. Environ., Beihang Univ., Beijing, China
Abstract :
Recent studies have shown that mining compressed frequent itemset patterns offers more benefits than mining closed frequent patterns, because it yields more compact and representative result sets. This is especially meaningful in data stream environments, where limited memory space and computation quality are major challenges. This paper presents and studies the problem of mining compressed frequent itemset patterns over sliding windows of a data stream. First, a novel data structure, the CP-Tree (compressed pattern tree), is designed to maintain a dynamically selected set of compressed frequent itemset patterns over a sliding window. Second, an efficient algorithm, CFPstream (compressing frequent patterns over stream), is developed to discover compressed frequent itemset patterns incrementally in data stream sliding windows. Finally, several optimization techniques are adopted in CFPstream to speed up the algorithm and prune the search space. Experiments on both real and synthetic data sets show that CFPstream outperforms representative state-of-the-art algorithms.
Keywords :
data compression; data mining; tree data structures; CFPstream algorithm; CP-tree; compressed frequent itemset pattern mining; compressed pattern tree; data stream sliding windows; data structure; optimization techniques; synthetic datasets; Data analysis; Data structures; Databases; Explosives; Itemsets; Programming; Wireless sensor networks; data stream; sliding window
Conference_Title :
Intelligent Computing and Intelligent Systems, 2009. ICIS 2009. IEEE International Conference on
Conference_Location :
Shanghai
Print_ISBN :
978-1-4244-4754-1
Electronic_ISBN :
978-1-4244-4738-1
DOI :
10.1109/ICICISYS.2009.5358398