DocumentCode
477656
Title
A Weight Entropy k-Means Algorithm for Clustering Dataset with Mixed Numeric and Categorical Data
Author
Li, Taoying ; Chen, Yan
Author_Institution
Sch. of Econ. & Manage., Dalian Maritime Univ., Dalian
Volume
1
fYear
2008
fDate
18-20 Oct. 2008
Firstpage
36
Lastpage
41
Abstract
The traditional k-means algorithm makes the distances between objects in the same cluster as small as possible, but it does not handle the distances between objects in different clusters well, and datasets with mixed numeric and categorical data are usually not classified correctly. This paper proposes the IWEKM (improved weight entropy k-means) algorithm. It addresses these problems by modifying the cost function of the entropy weighting k-means clustering algorithm with two added terms: one linearly related to the sum of squared distances between the mean of all objects and the means of all clusters, and one related to the relativity degree of the categorical data. Results of different clustering algorithms applied to the Iris and Flag data show that the proposed algorithm is efficient.
Keywords
entropy; pattern clustering; Flag data; IWEKM; Iris data; categorical data; cost function; numeric data; weight entropy k-means algorithm; Clustering algorithms; Conference management; Cost function; Entropy; Fuzzy systems; Iris; Knowledge management; Partitioning algorithms; Utility programs; clustering; k-means algorithm; partition clustering; weight entropy
fLanguage
English
Publisher
ieee
Conference_Titel
Fuzzy Systems and Knowledge Discovery, 2008. FSKD '08. Fifth International Conference on
Conference_Location
Shandong
Print_ISBN
978-0-7695-3305-6
Type
conf
DOI
10.1109/FSKD.2008.32
Filename
4665935