DocumentCode :
245984
Title :
An Improved Semi-supervised K-Means Algorithm Based on Information Gain
Author :
Liu Zhenpeng ; Guo Ding ; Zhang Xizhong ; Wang Xu ; Zhu Xianchao
Author_Institution :
Sch. of Electron. Inf. Eng., Hebei Univ., Baoding, China
fYear :
2014
fDate :
19-21 Dec. 2014
Firstpage :
1960
Lastpage :
1963
Abstract :
The traditional K-means algorithm is sensitive to the choice of initial centers and treats every dimension of multidimensional data as equally important. As a result, it cannot suppress the effects of differences among the data dimensions, nor can it reflect how much each dimension contributes to the clustering. Semi-supervised clustering introduces a small number of labeled sample points, which can significantly reduce the number of iterations and improve both clustering accuracy and iteration efficiency. This paper introduces information-gain weighting into the semi-supervised K-means algorithm. By using a small number of labeled samples to calculate the information-gain weights and to determine the initial centers, the proposed algorithm obtains higher-quality clustering and maintains cluster stability.
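The idea summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives no formulas, so the equal-width binning used to compute information gain, the weighted squared Euclidean distance, the class-mean seeding of the centers, and all function names (`information_gain`, `feature_weights`, `seed_centers`, `weighted_kmeans`) are assumptions made for illustration only.

```python
import math
from collections import Counter, defaultdict


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(values, labels, bins=3):
    """Information gain of one feature, using equal-width bins (assumed discretization)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0  # avoid a zero width for constant features
    groups = defaultdict(list)
    for v, y in zip(values, labels):
        groups[min(int((v - lo) / width), bins - 1)].append(y)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def feature_weights(X_labeled, y_labeled, bins=3):
    """Per-dimension weights: information gains normalized to sum to 1 (assumed scheme)."""
    dims = len(X_labeled[0])
    gains = [information_gain([x[d] for x in X_labeled], y_labeled, bins)
             for d in range(dims)]
    total = sum(gains)
    if total == 0:  # no dimension is informative: fall back to equal weights
        return [1.0 / dims] * dims
    return [g / total for g in gains]


def seed_centers(X_labeled, y_labeled):
    """Initial centers: the mean of the labeled samples of each class, in label order."""
    by_class = defaultdict(list)
    for x, y in zip(X_labeled, y_labeled):
        by_class[y].append(x)
    return [[sum(col) / len(pts) for col in zip(*pts)]
            for _, pts in sorted(by_class.items())]


def weighted_kmeans(X, centers, weights, max_iter=100):
    """K-means with a per-dimension weighted squared Euclidean distance."""
    assign = []
    for _ in range(max_iter):
        # Assignment step: nearest center under the weighted distance.
        assign = [
            min(range(len(centers)),
                key=lambda k: sum(w * (a - b) ** 2
                                  for w, a, b in zip(weights, x, centers[k])))
            for x in X
        ]
        # Update step: each center becomes the mean of its members.
        new_centers = []
        for k, c in enumerate(centers):
            members = [x for x, a in zip(X, assign) if a == k]
            new_centers.append(
                [sum(col) / len(members) for col in zip(*members)] if members else c)
        if new_centers == centers:  # converged
            break
        centers = new_centers
    return assign, centers
```

In this sketch, the labeled subset is passed to `feature_weights` and `seed_centers`, and the resulting weights and centers are then handed to `weighted_kmeans` together with the full data set. A dimension that does not help separate the labeled classes receives a small weight and therefore has little influence on the distance.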
Keywords :
iterative methods; pattern classification; pattern clustering; cluster stability; clustering accuracy; clustering dimension; dimensional data dimension; improved semisupervised k-means algorithm; information gain; information gain weight calculation; iteration; multidimensional data; semisupervised clustering; Accuracy; Algorithm design and analysis; Classification algorithms; Clustering algorithms; Data mining; Iris; Machine learning algorithms; K-means; data mining; information gain; semi-supervised;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Computational Science and Engineering (CSE), 2014 IEEE 17th International Conference on
Conference_Location :
Chengdu
Print_ISBN :
978-1-4799-7980-6
Type :
conf
DOI :
10.1109/CSE.2014.358
Filename :
7023870