DocumentCode :
477708
Title :
Research and Application of Improved K-Means Algorithm Based on Fuzzy Feature Selection
Author :
Li, Xiuyun ; Yang, Jie ; Wang, Qing ; Fan, Jinjin ; Liu, Peng
Volume :
1
fYear :
2008
fDate :
18-20 Oct. 2008
Firstpage :
401
Lastpage :
405
Abstract :
K-means is a widely used clustering algorithm in data mining. The traditional algorithm treats every feature equally, so each contributes the same to the clustering, yet redundant and irrelevant features may disturb the result. This paper proposes an improved K-means algorithm based on a fuzzy feature selection strategy. The method relies on measuring a 'feature important factor' (FIF). First, the result of an initial clustering is used to obtain class labels; second, a decision tree is built to calculate the FIF; third, the clustering algorithm is run again with the FIF modifying the similarity measure, yielding the modified clustering result. Experiments on UCI datasets show that the fuzzy feature selection strategy can effectively improve the clustering result. Finally, the algorithm is applied to a human resource dataset of a domestic university as further evidence of its effectiveness and practicability.
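The three steps in the abstract can be sketched roughly as follows. The abstract does not say how the FIF is derived from the decision tree or how exactly it enters the similarity measure, so this sketch makes two loud assumptions: the FIF of a feature is taken to be the information gain of its best single-threshold split (a depth-1 tree) against the first-pass cluster labels, normalized so the factors average to 1, and the FIF is used as a per-feature weight in squared Euclidean distance. All function names are hypothetical and none come from the paper.

```python
import math
import random

def weighted_dist(a, b, w):
    # Squared Euclidean distance with one weight per feature.
    return sum(wi * (x - y) ** 2 for wi, x, y in zip(w, a, b))

def kmeans(points, k, weights, iters=50, seed=0):
    # Plain Lloyd iterations; initial centers are sampled from the data.
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: weighted_dist(p, centers[c], weights))
                  for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous center
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def entropy(labels):
    n = len(labels)
    counts = {}
    for lab in labels:
        counts[lab] = counts.get(lab, 0) + 1
    return -sum(c / n * math.log2(c / n) for c in counts.values())

def best_split_gain(values, labels):
    # ASSUMPTION: information gain of the best threshold split on a single
    # feature stands in for the paper's decision-tree-based FIF.
    base = entropy(labels)
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best = 0.0
    for i in range(1, n):
        if pairs[i][0] == pairs[i - 1][0]:
            continue  # no threshold fits between equal values
        left = [lab for _, lab in pairs[:i]]
        right = [lab for _, lab in pairs[i:]]
        gain = base - (i / n) * entropy(left) - ((n - i) / n) * entropy(right)
        best = max(best, gain)
    return best

def feature_importance_factors(points, labels):
    gains = [best_split_gain(list(col), labels) for col in zip(*points)]
    total = sum(gains)
    d = len(gains)
    if total == 0:
        return [1.0] * d  # fall back to equal weights
    return [d * g / total for g in gains]  # scaled so the factors average to 1

def fif_kmeans(points, k, seed=0):
    d = len(points[0])
    first = kmeans(points, k, [1.0] * d, seed=seed)   # step 1: initial class labels
    fif = feature_importance_factors(points, first)   # step 2: FIF per feature
    second = kmeans(points, k, fif, seed=seed)        # step 3: re-cluster with weights
    return fif, second
```

Calling `fif_kmeans(points, k)` on a list of equal-length numeric rows returns the per-feature factors and the second-pass labels. A feature that separates the first-pass clusters well receives a factor above 1, while a feature that carries little information about them is down-weighted in the second pass.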
Keywords :
data mining; fuzzy set theory; pattern clustering; statistical analysis; feature important factor; fuzzy feature selection; improved k-means algorithm; k-means clustering algorithm; Clustering algorithms; Data engineering; Decision trees; Finance; Fuzzy systems; Humans; Information management; Knowledge engineering; Clustering; Human resource; K-means
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Fuzzy Systems and Knowledge Discovery, 2008. FSKD '08. Fifth International Conference on
Conference_Location :
Shandong
Print_ISBN :
978-0-7695-3305-6
Type :
conf
DOI :
10.1109/FSKD.2008.451
Filename :
4666008