Title :
High performance frequent pattern mining on multi-core cluster
Author :
Vu, Lan ; Alaghba, Gita
Author_Institution :
Dept. of Comput. Sci. & Eng., Univ. of Colorado Denver, Denver, CO, USA
Abstract :
Mining frequent patterns is a fundamental data mining task with numerous practical applications such as consumer market-basket analysis, web mining, and network intrusion detection. When the database is large, executing this mining task on a personal computer is non-trivial because of the large computation time and memory consumption. In our previous research, we proposed a novel algorithm named FEM, which is more efficient than well-known algorithms like Apriori, Eclat, or FP-growth in discovering frequent patterns from both dense and sparse databases. However, to apply FEM to applications with large-scale databases, it is essential to develop new parallel algorithms that are based on FEM and to deploy this mining task on high performance computer systems. In this paper, we present a new method named PFEM that parallelizes the FEM algorithm for a cluster of multi-core machines. Our proposed method allows each machine in the cluster to execute an independent mining workload to improve scalability. Computations within a multi-core machine use a shared memory model to reduce communication overhead and maintain load balance. By combining the distributed memory and shared memory computational models, PFEM can adapt well to large computer systems with many multi-core machines.
Keywords :
data mining; database management systems; distributed memory systems; finite element analysis; groupware; pattern clustering; resource allocation; shared memory systems; FEM; collaboration; database size; distributed memory computational models; frequent pattern mining; load balance; multi-core cluster; multi-core machines; personal computer; shared memory computational models; Algorithm design and analysis; Association rules; Computational modeling; Finite element methods; Itemsets; association rule mining; parallel algorithm; transactional databases
Conference_Title :
Collaboration Technologies and Systems (CTS), 2012 International Conference on
Conference_Location :
Denver, CO
Print_ISBN :
978-1-4673-1381-0
DOI :
10.1109/CTS.2012.6261005