DocumentCode :
2849820
Title :
Moment: maintaining closed frequent itemsets over a stream sliding window
Author :
Chi, Yun ; Wang, Haixun ; Yu, Philip S. ; Muntz, Richard R.
Author_Institution :
Dept. of Comput. Sci., California Univ., Los Angeles, CA, USA
fYear :
2004
fDate :
1-4 Nov. 2004
Firstpage :
59
Lastpage :
66
Abstract :
This paper considers the problem of mining closed frequent itemsets over a sliding window using limited memory space. We design a synopsis data structure to monitor transactions in the sliding window so that we can output the current closed frequent itemsets at any time. Due to time and memory constraints, the synopsis data structure cannot monitor all possible itemsets. However, monitoring only frequent itemsets makes it impossible to detect new itemsets when they become frequent. In this paper, we introduce a compact data structure, the closed enumeration tree (CET), to maintain a dynamically selected set of itemsets over a sliding window. The selected itemsets form a boundary between the closed frequent itemsets and the rest of the itemsets. Concept drifts in a data stream are reflected by boundary movements in the CET. In other words, a status change of any itemset (e.g., from non-frequent to frequent) must occur through the boundary. Because the boundary is relatively stable, the cost of mining closed frequent itemsets over a sliding window is dramatically reduced to that of mining transactions that can possibly cause boundary movements in the CET. Our experiments show that our algorithm performs much better than previous approaches.
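To make the terms in the abstract concrete, the following is a minimal Python sketch of the problem itself, not of the paper's method. It is a naive baseline that recounts every itemset from scratch on each query, which is the cost the CET is designed to avoid. The class name `SlidingWindowMiner`, its method names, and the use of an absolute support count are assumptions made for illustration.

```python
from collections import Counter, deque
from itertools import combinations


class SlidingWindowMiner:
    """Naive baseline (hypothetical, not the CET): recount on every query."""

    def __init__(self, window_size, min_support):
        # A bounded deque drops the oldest transaction automatically.
        self.window = deque(maxlen=window_size)
        # Assumption: min_support is an absolute transaction count.
        self.min_support = min_support

    def add(self, transaction):
        self.window.append(frozenset(transaction))

    def closed_frequent_itemsets(self):
        # Count the support of every non-empty subset of every transaction.
        # This is exponential in transaction length; fine for a toy example only.
        support = Counter()
        for t in self.window:
            items = sorted(t)
            for k in range(1, len(items) + 1):
                for combo in combinations(items, k):
                    support[frozenset(combo)] += 1
        frequent = {s: c for s, c in support.items() if c >= self.min_support}
        # An itemset is closed if no proper superset has the same support.
        return {
            s: c
            for s, c in frequent.items()
            if not any(s < other and c == oc for other, oc in frequent.items())
        }
```

With a window of three transactions and a minimum support of two, feeding `{a, b}`, `{a, b, c}`, `{a, c}` yields the closed frequent itemsets `{a}`, `{a, b}` and `{a, c}`. Adding `{b, c}` evicts the oldest transaction and shifts the result to `{c}`, `{a, c}` and `{b, c}`, a small instance of the status changes the abstract describes.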
Keywords :
data mining; data structures; transaction processing; trees (mathematics); Moment; closed enumeration tree; closed frequent itemset mining; compact data structure; limited memory space; memory constraint; mining transactions; stream sliding window; synopsis data structure; time constraint; Association rules; Computer science; Costs; Data mining; Data structures; Itemsets; Memory management; Monitoring; Time factors; Tree data structures;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Data Mining, 2004. ICDM '04. Fourth IEEE International Conference on
Conference_Location :
Brighton, UK
Print_ISBN :
0-7695-2142-8
Type :
conf
DOI :
10.1109/ICDM.2004.10084
Filename :
1410267