DocumentCode :
2050466
Title :
Tree partition based parallel frequent pattern mining on shared memory systems
Author :
Chen, Dehao ; Lai, Chunrong ; Hu, Wei ; Chen, Wenguang ; Zhang, Yimin ; Zheng, Weimin
Author_Institution :
Dept. of Comput. Sci., Tsinghua Univ., Beijing, China
fYear :
2006
fDate :
25-29 April 2006
Abstract :
In this paper, we present a tree-partition algorithm for parallel mining of frequent patterns. Our work is based on the FP-Growth algorithm, which consists of a tree-building stage and a mining stage. The main idea is to build only one FP-Tree in memory, partition it into several independent parts, and distribute them to different threads. A heuristic algorithm is devised to balance the workload. Our algorithm not only alleviates the impact of locks during the tree-building stage but also avoids the overhead that greatly harms the mining stage. We present experiments on different kinds of datasets and compare the results with other parallel approaches. The results suggest that our approach has a significant efficiency advantage, especially on certain kinds of datasets. As the number of processors increases, our parallel algorithm shows good scalability.
Keywords :
data mining; parallel algorithms; pattern recognition; shared memory systems; FP-growth algorithm; FP-tree; heuristic algorithm; parallel algorithm; parallel frequent pattern mining; shared memory systems; tree partition based pattern mining; tree-building stage; Algorithm design and analysis; Association rules; Computer science; Data mining; Heuristic algorithms; Parallel algorithms; Partitioning algorithms; Scalability; Transaction databases; Yarn;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Parallel and Distributed Processing Symposium, 2006. IPDPS 2006. 20th International
Print_ISBN :
1-4244-0054-6
Type :
conf
DOI :
10.1109/IPDPS.2006.1639620
Filename :
1639620