Title :
A Distributed Algorithm of Density-Based Subspace Frequent Closed Itemset Mining
Author :
Huaiguo Fu ; Micheal O Foghlu
Author_Institution :
Telecommun. Software & Syst. Group, Waterford Inst. of Technol., Waterford, Ireland
Abstract :
Mining large, densely packed, high-dimensional data remains a challenge for frequent closed itemset mining in association analysis, even though frequent closed itemset mining is an efficient way to reduce the complexity of mining frequent itemsets. This paper proposes a distributed algorithm for discovering frequent closed itemsets in such data. The algorithm partitions the search space of frequent closed itemsets into independent, nonoverlapping subspaces, each of which can be processed separately to generate frequent closed itemsets. The algorithm generates frequent closed itemsets in order of density priority: denser or more frequent closed itemsets are generated first. The experimental results show that the algorithm efficiently extracts frequent closed itemsets from large data.
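The partitioning idea in the abstract can be illustrated with a small sketch. This is not the authors' algorithm, whose details the abstract does not give. It is a brute-force illustration under stated assumptions: each subspace is keyed by one "head" item, a closed itemset belongs to the subspace of its highest-ranked item, and items are ranked by descending support as a loose reading of "dense priority". The names `closure`, `mine_subspace`, and `mine_closed` are hypothetical, and `min_support` is assumed to be at least 1.

```python
from itertools import combinations


def closure(itemset, transactions):
    """Return (closed itemset, support) for `itemset`.

    The closure is the intersection of all transactions containing `itemset`.
    """
    covering = [t for t in transactions if itemset <= t]
    if not covering:
        return frozenset(), 0
    return frozenset.intersection(*covering), len(covering)


def mine_subspace(head, tail, earlier, transactions, min_support):
    """Mine one subspace: closed frequent itemsets whose highest-ranked item is `head`.

    Brute force over subsets of `tail` (items ranked after `head`). A closure that
    pulls in an item from `earlier` belongs to another subspace and is skipped.
    """
    found = {}
    for size in range(len(tail) + 1):
        for extra in combinations(tail, size):
            closed, support = closure(frozenset((head, *extra)), transactions)
            if support >= min_support and not (closed & earlier):
                found[closed] = support
    return found


def mine_closed(transactions, min_support):
    """Return a list of (head item, {closed itemset: support}) pairs, one per subspace."""
    transactions = [frozenset(t) for t in transactions]
    counts = {}
    for t in transactions:
        for item in t:
            counts[item] = counts.get(item, 0) + 1
    # Assumption: rank frequent items by descending support, ties broken by name.
    ranked = sorted(
        (i for i in counts if counts[i] >= min_support),
        key=lambda i: (-counts[i], i),
    )
    results = []
    for pos, head in enumerate(ranked):
        earlier = frozenset(ranked[:pos])
        tail = ranked[pos + 1:]
        # Each call depends only on its own arguments, so in a distributed
        # setting every subspace could be sent to a different worker.
        results.append(
            (head, mine_subspace(head, tail, earlier, transactions, min_support))
        )
    return results


if __name__ == "__main__":
    data = [{"a", "b"}, {"a", "b", "c"}, {"a", "c"}, {"a"}]
    for head, itemsets in mine_closed(data, min_support=2):
        print(head, {tuple(sorted(s)): n for s, n in itemsets.items()})
```

Because every non-empty closed itemset has exactly one highest-ranked item, the subspaces do not overlap and no merge step is needed beyond concatenating the per-subspace results. The subset enumeration is exponential in the number of items, so the sketch only suits toy inputs.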
Keywords :
computational complexity; data mining; distributed algorithms; very large databases; association rule analysis; complexity reduction; density-based subspace frequent closed itemset mining; distributed algorithm; high-dimensional data mining; large dense-packed dataset; Algorithm design and analysis; Complexity theory; Data mining; Distributed algorithms; Itemsets; Lattices; Partitioning algorithms; Association analysis; Concept lattice; Distributed algorithm; Frequent closed itemset mining; Partition;
Conference_Title :
High Performance Computing and Communications, 2008. HPCC '08. 10th IEEE International Conference on
Conference_Location :
Dalian
Print_ISBN :
978-0-7695-3352-0
DOI :
10.1109/HPCC.2008.147