DocumentCode :
2437280
Title :
Research of Top-N Frequent Closed Itemsets Mining Algorithm
Author :
Liu, Lizhi ; Liu, Jun
Author_Institution :
Sch. of Comput. Sci. & Eng., Wuhan Inst. of Technol., Wuhan
Volume :
2
fYear :
2008
fDate :
19-20 Dec. 2008
Firstpage :
168
Lastpage :
172
Abstract :
This paper introduces an algorithm for mining the top-n frequent closed itemsets of length no less than min_l, where n is the desired number of frequent closed itemsets to be mined and min_l is the minimal length of each itemset. An efficient algorithm, called TFP, is developed for mining such itemsets without a user-specified min_support. Starting at min_support=0 and making use of the length constraint and the properties of top-n frequent closed itemsets, min_support can be raised effectively and the FP-Tree can be pruned dynamically, both during and after the construction of the tree, using our two proposed methods: the closed node count and descendant_sum. Moreover, mining is further sped up by employing a bottom-up combined FP-Tree traversing strategy, a set of search space pruning methods, a fast 2-level hash-indexed result tree, and a novel closed itemset verification scheme. Our extensive performance study shows that TFP has high performance and linear scalability in terms of the database size.
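To make the problem statement in the abstract concrete, the following is a minimal brute-force sketch of what "top-n frequent closed itemsets of length no less than min_l" means. It is not TFP: it uses no FP-Tree, no closed node count, and no descendant_sum pruning, and it is exponential in the number of distinct items. The function name `top_n_closed_itemsets`, the return format, and the tie-breaking order are assumptions made for illustration only.

```python
from itertools import combinations


def top_n_closed_itemsets(transactions, n, min_l):
    """Brute-force reference (NOT the TFP algorithm).

    Returns up to n (support, itemset) pairs for closed itemsets with at
    least min_l items, ordered by descending support. Ties are broken by
    the sorted itemset, which is an arbitrary choice for this sketch.
    """
    transactions = [frozenset(t) for t in transactions]
    items = sorted(set().union(*transactions))
    found = []
    for k in range(min_l, len(items) + 1):
        for combo in combinations(items, k):
            candidate = frozenset(combo)
            covering = [t for t in transactions if candidate <= t]
            if not covering:
                continue  # support 0: the itemset never occurs
            # An itemset is closed iff it equals the intersection of all
            # transactions that contain it (no superset has equal support).
            if frozenset.intersection(*covering) == candidate:
                found.append((len(covering), tuple(sorted(candidate))))
    found.sort(key=lambda pair: (-pair[0], pair[1]))
    return found[:n]


if __name__ == "__main__":
    db = [{"a", "b", "c"}, {"a", "b", "c"}, {"a", "b"}, {"a", "d"}, {"b", "c"}]
    for support, itemset in top_n_closed_itemsets(db, n=3, min_l=2):
        print(support, itemset)
```

In this toy database the support of the third result is the threshold that a user would otherwise have had to guess as min_support; the abstract describes TFP as arriving at such a threshold by raising min_support from 0 while it mines.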
Keywords :
data mining; trees (mathematics); FP-Tree traversing strategy; closed itemset verification scheme; fast 2-level hash-indexed result tree; search space pruning methods; top-N frequent closed itemsets mining algorithm; Application software; Computational intelligence; Computer industry; Computer science; Conferences; Data mining; Itemsets; Mining industry; Scalability; Transaction databases;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Computational Intelligence and Industrial Application, 2008. PACIIA '08. Pacific-Asia Workshop on
Conference_Location :
Wuhan
Print_ISBN :
978-0-7695-3490-9
Type :
conf
DOI :
10.1109/PACIIA.2008.399
Filename :
4756758