DocumentCode
3529667
Title
PGMCLU: A novel parallel grid-based clustering algorithm for multi-density datasets
Author
Chen Xiaoyun ; Chen Yi ; Qi Xiaoli ; Yue Min ; He Yanshan
Author_Institution
Sch. of Inf. Sci. & Eng., Lanzhou Univ., Lanzhou, China
fYear
2009
fDate
23-24 Aug. 2009
Firstpage
166
Lastpage
171
Abstract
Clustering is one of the basic data mining tasks, and clustering massive, high-dimensional data is a particularly important problem in cluster analysis. However, some existing clustering algorithms are suitable only for small and medium-sized datasets, and multi-density datasets are also difficult for some clustering methods. To address these issues, this paper presents PGMCLU, a novel parallel grid-based clustering algorithm for multi-density datasets, based on the ideas of data parallelism and merging local clusters. The proposed algorithm uses a new measure, called grid compactness, which reflects the degree of tightness among the data points within a grid. Furthermore, it introduces the notion of a grid feature for summarizing the information about a grid, and proposes novel approaches to data partitioning, local clustering, and merging local clusters. Extensive theoretical analysis and experimental results on both real and synthetic datasets show that the PGMCLU algorithm is effective and scalable, and achieves approximately linear speedup.
Keywords
data mining; grid computing; parallel algorithms; pattern clustering; PGMCLU; PGMCLU algorithm; cluster analysis; data parallelism; data partitioning; grid compactness; local cluster merging; multidensity dataset; parallel grid-based clustering algorithm; Algorithm design and analysis; Clustering algorithms; Data engineering; Data mining; Image analysis; Machine learning algorithms; Merging; Partitioning algorithms; Personal communication networks; Programming profession
fLanguage
English
Publisher
IEEE
Conference_Titel
Web Society, 2009. SWS '09. 1st IEEE Symposium on
Conference_Location
Lanzhou
Print_ISBN
978-1-4244-4157-0
Electronic_ISBN
978-1-4244-4158-7
Type
conf
DOI
10.1109/SWS.2009.5271791
Filename
5271791