DocumentCode
3683089
Title
Improved Research to K-means Initial Cluster Centers
Author
Zhang Min;Duan Kai-fei
Author_Institution
Coll. of Inf. &
fYear
2015
Firstpage
349
Lastpage
353
Abstract
K-means is one of the more traditional algorithms in cluster analysis, and it has many shortcomings: the value of K is easily affected by subjective human choices, the algorithm easily falls into a local optimum, and the clustering result is unstable. K-means++ is the classic improvement on K-means, but its cluster centers can still be unstable. This paper addresses that shortcoming of K-means++ by introducing the concept of variance from probability and mathematical statistics. Here, variance reflects how densely a sample is surrounded by the other samples. When selecting the first initial cluster center in K-means++, the proposed method chooses the sample point with minimum variance, which lies in the region of highest sample density; the remaining cluster centers are then selected with the D² weighting method described in K-means++. Experimental results show higher accuracy and better stability.
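The seeding idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not define the per-sample variance, so the code assumes it is the mean squared Euclidean distance from a sample to all other samples. The names `squared_distance` and `choose_initial_centers` and the `seed` parameter are hypothetical.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def choose_initial_centers(points, k, seed=None):
    """Pick k initial centers: minimum-variance point first, then D^2 weighting."""
    rng = random.Random(seed)
    n = len(points)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and the number of points")

    # ASSUMPTION: the "variance" of a sample is its mean squared distance
    # to all other samples; a small value means a dense neighbourhood.
    spread = [
        sum(squared_distance(p, q) for q in points) / max(n - 1, 1)
        for p in points
    ]
    first = min(range(n), key=spread.__getitem__)
    chosen = [first]

    # d2[i] is the squared distance from point i to its nearest chosen center.
    d2 = [squared_distance(p, points[first]) for p in points]
    while len(chosen) < k:
        if sum(d2) == 0:
            # All remaining points coincide with a center: pick uniformly.
            nxt = rng.choice([i for i in range(n) if i not in chosen])
        else:
            # K-means++ step: sample with probability proportional to D^2.
            nxt = rng.choices(range(n), weights=d2, k=1)[0]
        chosen.append(nxt)
        d2 = [min(d, squared_distance(p, points[nxt])) for d, p in zip(d2, points)]

    return [points[i] for i in chosen]


if __name__ == "__main__":
    samples = [(0.0,), (1.0,), (2.0,), (3.0,), (4.0,)]
    print(choose_initial_centers(samples, 3, seed=0))
```

Only the first center is deterministic in this sketch. The later centers are still drawn at random with D² weights, so repeated runs can differ unless the seed is fixed.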
Keywords
"Clustering algorithms","Accuracy","Algorithm design and analysis","Iris","Machine learning algorithms","Data collection","Probability"
Publisher
ieee
Conference_Titel
Frontier of Computer Science and Technology (FCST), 2015 Ninth International Conference on
Type
conf
DOI
10.1109/FCST.2015.61
Filename
7314704