Title :
Set and array based hybrid data structure solution for Frequent Pattern Mining
Author :
Neha Dwivedi;Srinivasa Rao Satti
Author_Institution :
Seoul National University, Korea
Abstract :
The problem of Frequent Pattern Mining has been widely studied in the literature because of its numerous applications to a variety of data mining problems such as clustering and classification. In this paper, a new vertical format mining algorithm, HybridDSItr, is proposed. The algorithm uses a hybrid data structure, HybridDS, to store the dataset compactly, and an iterative procedure to reduce intermediate candidate generation, saving both memory and time. Experimental studies compare the new algorithm with the trie-based FP-Growth algorithm and the vertical-format-based Eclat algorithm. For sparse datasets, the experimental results confirm the following observations. The algorithm runs faster than both Eclat and FP-Growth. It uses less memory than FP-Growth, and similar or less memory than Eclat. This new approach can be applied to improve the memory and time efficiency of existing vertical-format-based mining algorithms.
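For readers unfamiliar with vertical format mining, the following is a minimal sketch of the general idea as used by Eclat-style algorithms: each item is mapped to the set of transaction ids that contain it, and the support of an itemset is the size of the intersection of those sets. It is not the HybridDSItr algorithm or the HybridDS structure, whose details the abstract does not give. The function names (`to_vertical`, `mine_vertical`), the use of plain Python sets, and the explicit stack in place of recursion are all assumptions made for illustration.

```python
from collections import defaultdict


def to_vertical(transactions):
    """Map each item to the set of transaction ids (tidset) that contain it."""
    tidsets = defaultdict(set)
    for tid, items in enumerate(transactions):
        for item in items:
            tidsets[item].add(tid)
    return dict(tidsets)


def mine_vertical(transactions, min_support):
    """Return {itemset as sorted tuple: support} for every frequent itemset.

    Generic Eclat-style search (an illustrative assumption, not the paper's
    method): itemsets are extended one item at a time, and support is
    computed by intersecting tidsets.
    """
    vertical = to_vertical(transactions)
    # Keep only frequent single items, in a fixed order so that each
    # itemset is generated exactly once.
    items = sorted(i for i, tids in vertical.items() if len(tids) >= min_support)

    result = {}
    # Explicit stack instead of recursion. Each entry holds:
    # (current itemset, its tidset, index of the next item to try).
    stack = [((item,), vertical[item], idx + 1) for idx, item in enumerate(items)]
    while stack:
        prefix, tids, start = stack.pop()
        result[prefix] = len(tids)
        for idx in range(start, len(items)):
            new_tids = tids & vertical[items[idx]]
            if len(new_tids) >= min_support:
                stack.append((prefix + (items[idx],), new_tids, idx + 1))
    return result


if __name__ == "__main__":
    data = [{"a", "b", "c"}, {"a", "c"}, {"a", "d"}, {"b", "c"}]
    for itemset, support in sorted(mine_vertical(data, min_support=2).items()):
        print(itemset, support)
```

In this sketch the tidsets are ordinary hash sets. A compact representation of the kind the abstract describes would replace them, but how HybridDS combines sets and arrays is not stated here, so that part is left out.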
Keywords :
"Itemsets","Generators"
Conference_Title :
Digital Information Management (ICDIM), 2015 Tenth International Conference on
DOI :
10.1109/ICDIM.2015.7381879