DocumentCode :
1153347
Title :
Fast algorithms for frequent itemset mining using FP-trees
Author :
Grahne, Gosta ; Zhu, Jianfei
Author_Institution :
Dept. of Comput. Sci., Concordia Univ., Montreal, Que., Canada
Volume :
17
Issue :
10
fYear :
2005
Firstpage :
1347
Lastpage :
1362
Abstract :
Efficient algorithms for mining frequent itemsets are crucial for mining association rules as well as for many other data mining tasks. Methods for mining frequent itemsets have been implemented using a prefix-tree structure, known as an FP-tree, for storing compressed information about frequent itemsets. Numerous experimental results have demonstrated that these algorithms perform extremely well. In this paper, we present a novel FP-array technique that greatly reduces the need to traverse FP-trees, thus obtaining significantly improved performance for FP-tree-based algorithms. Our technique works especially well for sparse data sets. Furthermore, we present new algorithms for mining all, maximal, and closed frequent itemsets. Our algorithms use the FP-tree data structure in combination with the FP-array technique efficiently and incorporate various optimization techniques. We also present experimental results comparing our methods with existing algorithms. The results show that our methods are the fastest for many cases. Even though the algorithms consume much memory when the data sets are sparse, they are still the fastest ones when the minimum support is low. Moreover, they are always among the fastest algorithms and consume less memory than other methods when the data sets are dense.
Keywords :
data mining; tree data structures; very large databases; FP-array technique; FP-tree data structures; association rules; frequent itemset mining; prefix-tree structure; sparse data sets; data structures; itemsets; lattices; multidimensional systems; transaction databases
fLanguage :
English
Journal_Title :
Knowledge and Data Engineering, IEEE Transactions on
Publisher :
IEEE
ISSN :
1041-4347
Type :
jour
DOI :
10.1109/TKDE.2005.166
Filename :
1501819