Title :
DISC: Efficient Uncertain Frequent Pattern Mining with Tightened Upper Bounds
Author :
MacKinnon, Richard Kyle ; Strauss, Teagan D. ; Leung, Carson Kai-Sang
Author_Institution :
Dept. of Comput. Sci., Univ. of Manitoba, Winnipeg, MB, Canada
Abstract :
UF-growth is a tree-based exact algorithm for mining frequent patterns from uncertain data. While it directly calculates the expected support of an item set, it requires a significant amount of storage space to capture all existential probability values among the items. To eliminate the extra space requirement of UF-growth, the CUF-growth algorithm combines nodes with the same item by storing an upper bound on expected support. In this paper, we introduce two new algorithms for achieving a tighter upper bound than CUF-growth, and we evaluate the trade-off between storing more information to further tighten the bound and its effect on the performance of the algorithm. Experimental results show the effectiveness of our algorithms.
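The distinction between an exact expected support and a stored upper bound can be illustrated with a minimal sketch. It assumes that items within a transaction are independent, so the expected support of an itemset is the sum, over the transactions containing it, of the product of its items' existential probabilities. The cap used here (the product of the two highest probabilities in each transaction) is one simple bound chosen for illustration; the abstract does not say which bound CUF-growth or the new algorithms store. The function names and data layout are hypothetical.

```python
from math import prod


def expected_support(itemset, transactions):
    """Exact expected support, assuming item independence.

    Each transaction is a dict mapping item -> existential probability.
    """
    total = 0.0
    for t in transactions:
        if all(item in t for item in itemset):
            total += prod(t[item] for item in itemset)
    return total


def capped_support(itemset, transactions):
    """Illustrative upper bound on expected support (an assumption,
    not the bound defined in the paper).

    For every transaction containing the itemset, the exact product is
    replaced by the product of the transaction's two highest
    probabilities. This is a valid bound only for itemsets with at
    least two items, so smaller itemsets are rejected.
    """
    if len(itemset) < 2:
        raise ValueError("the cap is only an upper bound for 2+ items")
    total = 0.0
    for t in transactions:
        if all(item in t for item in itemset):
            top_two = sorted(t.values(), reverse=True)[:2]
            total += prod(top_two)
    return total
```

The cap depends only on the transaction and not on which items are queried, which is why a bound of this kind can be stored once per tree node instead of keeping every individual probability. The price is that the bound may overestimate, so candidate patterns still need to be checked against their exact expected support.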
Keywords :
data mining; pattern recognition; probability; tree data structures; trees (mathematics); CUF-growth algorithm; DISC; information storage; item set; probability values; space requirement; storage space; tree-based exact algorithm; uncertain data; uncertain frequent pattern mining; Algorithm design and analysis; Approximation algorithms; Clustering algorithms; Conferences; Data mining; Databases; Upper bound; Association analysis; data mining algorithms; frequent patterns; tree structures
Conference_Title :
Data Mining Workshop (ICDMW), 2014 IEEE International Conference on
Conference_Location :
Shenzhen
Print_ISBN :
978-1-4799-4275-6
DOI :
10.1109/ICDMW.2014.129