DocumentCode :
2130783
Title :
Parallel Hierarchical Clustering on Market Basket Data
Author :
Wang, Baoying ; Ding, Qin ; Rahal, Imad
Author_Institution :
Waynesburg Univ., Waynesburg, PA
fYear :
2008
fDate :
15-19 Dec. 2008
Firstpage :
526
Lastpage :
532
Abstract :
Data clustering has proven to be a promising data mining technique. Recently, there have been many attempts to cluster market-basket data. In this paper, we propose a parallelized hierarchical clustering approach for market-basket data (PH-Clustering), implemented using MPI. Based on an analysis of the major clustering steps, we adopt a partially local, partially global approach that decreases computation time while keeping communication time to a minimum. Load balance is considered throughout, especially at the data partitioning stage. Our experimental results demonstrate that PH-Clustering is substantially faster than sequential clustering. The larger the data size, the more significant the speedup when the number of processors is large. Our results also show that the number of items has more impact on the performance of PH-Clustering than the number of transactions.
Keywords :
data analysis; message passing; parallel algorithms; pattern clustering; resource allocation; MPI; data clustering; data mining; data partitioning; load balance issue; market basket data; parallel hierarchical clustering; parallelized hierarchical clustering; partial global approach; partial local approach; sequential clustering; Conferences; Data analysis; Data mining; Data structures; Decision making; Educational institutions; Itemsets; Message passing; Velocity measurement; Weight measurement; data mining; hierarchical clustering; market basket data; parallel computing;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Data Mining Workshops, 2008. ICDMW '08. IEEE International Conference on
Conference_Location :
Pisa
Print_ISBN :
978-0-7695-3503-6
Electronic_ISBN :
978-0-7695-3503-6
Type :
conf
DOI :
10.1109/ICDMW.2008.32
Filename :
4733976