Title :
Fuzzy c-means clustering of mixed databases including numerical and nominal variables
Author :
Honda, Katsuhiro ; Ichihashi, Hidetomo
Author_Institution :
Graduate Sch. of Eng., Osaka Prefecture Univ., Japan
Abstract :
Fuzzy c-means (FCM) clustering is an unsupervised classification method for revealing the intrinsic structure of multivariate data sets. It is, however, applicable only to databases consisting of numerical variables. For analyzing the intrinsic features of categorical data sets, many approaches to the quantification of nominal variables have been proposed. Most of them aim to construct combined plots of category quantifications and object scores. In this paper, we propose a new approach to the clustering of mixed databases that include categorical as well as numerical variables. The clustering technique uses a simple FCM-type iterative algorithm that includes a quantification step. In the quantification step, the category scores are derived so that they suit FCM clustering, taking cluster centers and memberships into account. Numerical experiments demonstrate the characteristic features of the proposed method.
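For orientation, the sketch below shows the standard FCM iteration for purely numerical data, which is the baseline the abstract builds on. It is not the proposed method: the abstract does not specify how the quantification step derives the category scores, so that step is left out. The function name `fcm`, its parameters (`m`, `iters`, `tol`, `seed`), and the random initialization are assumptions made for illustration only.

```python
import random


def fcm(points, c, m=2.0, iters=100, tol=1e-6, seed=0):
    """Standard fuzzy c-means on numerical vectors (lists of floats).

    Illustrative sketch only; it omits the quantification step for
    nominal variables that the abstract describes.
    Returns (centers, memberships), memberships[i][k] in [0, 1].
    """
    rng = random.Random(seed)
    n, d = len(points), len(points[0])

    # Assumed initialization: random memberships, each row sums to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        s = sum(row)
        u.append([v / s for v in row])

    centers = []
    for _ in range(iters):
        # Center update: mean of the points weighted by u_ik ** m.
        centers = []
        for k in range(c):
            w = [u[i][k] ** m for i in range(n)]
            total = sum(w)
            centers.append(
                [sum(w[i] * points[i][j] for i in range(n)) / total
                 for j in range(d)]
            )

        # Membership update from squared Euclidean distances:
        # u_ik is proportional to d2_ik ** (-1 / (m - 1)).
        shift = 0.0
        for i in range(n):
            d2 = [sum((points[i][j] - centers[k][j]) ** 2 for j in range(d))
                  for k in range(c)]
            zeros = [k for k in range(c) if d2[k] == 0.0]
            if zeros:
                # The point coincides with a center: split membership
                # evenly among the coinciding centers.
                new = [1.0 / len(zeros) if k in zeros else 0.0
                       for k in range(c)]
            else:
                inv = [x ** (-1.0 / (m - 1.0)) for x in d2]
                s = sum(inv)
                new = [v / s for v in inv]
            shift = max(shift, max(abs(a - b) for a, b in zip(new, u[i])))
            u[i] = new

        # Stop once the memberships have stabilized.
        if shift < tol:
            break

    return centers, u
```

Calling `fcm(points, 2)` on a list of equal-length numerical vectors returns two cluster centers and one membership row per point. A mixed database would first need numerical scores for its nominal categories, which is the part the proposed method derives inside the iteration.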
Keywords :
fuzzy set theory; pattern clustering; unsupervised learning; categorical data set; clustering technique; fuzzy c-means clustering; intrinsic feature; mixed databases; numerical variable; unsupervised classification method; Clustering algorithms; Data engineering; Data mining; Iterative algorithms; Least squares methods; Loss measurement; Minimization methods; Partitioning algorithms; Prototypes; Spatial databases;
Conference_Title :
Cybernetics and Intelligent Systems, 2004 IEEE Conference on
Print_ISBN :
0-7803-8643-4
DOI :
10.1109/ICCIS.2004.1460476