DocumentCode
123367
Title
A K-means clustering with optimized initial center based on Hadoop platform
Author
Kunhui Lin ; Xiang Li ; Zhongnan Zhang ; Jiahong Chen
Author_Institution
Software Sch., Xiamen Univ., Xiamen, China
fYear
2014
fDate
22-24 Aug. 2014
Firstpage
263
Lastpage
266
Abstract
With the explosive growth of data, traditional clustering algorithms running on separate servers cannot meet demand. To solve this problem, more and more researchers implement traditional clustering algorithms, especially K-means clustering, on cloud computing platforms. However, few researchers pay attention to the structure of K-means clustering itself; most optimize the model of the cloud computing platform to raise the computing speed of K-means clustering. As a result, the instability caused by random initial centers still exists. In this paper, we propose a K-means clustering algorithm with optimized initial centers based on data dimensional density. This method avoids the deficiency of random initial centers and improves the stability of K-means clustering. The experimental results show that the approach performs well and improves the accuracy of K-means clustering on the test set.
Keywords
cloud computing; data handling; pattern clustering; Hadoop platform; K-means clustering; MapReduce programming model; cloud computing platforms; data dimensional density; optimized initial center; random initial centers; Clustering algorithms; Computers; Educational institutions; Density; Initial center; K-means clustering; MapReduce;
fLanguage
English
Publisher
IEEE
Conference_Titel
Computer Science & Education (ICCSE), 2014 9th International Conference on
Conference_Location
Vancouver, BC
Print_ISBN
978-1-4799-2949-8
Type
conf
DOI
10.1109/ICCSE.2014.6926466
Filename
6926466