• DocumentCode
    437518
  • Title
    Fuzzy c-means clustering of mixed databases including numerical and nominal variables
  • Author
    Honda, Katsuhiro ; Ichihashi, Hidetomo
  • Author_Institution
    Graduate Sch. of Eng., Osaka Prefecture Univ., Japan
  • Volume
    1
  • fYear
    2004
  • fDate
    1-3 Dec. 2004
  • Firstpage
    558
  • Abstract
    Fuzzy c-means (FCM) clustering is an unsupervised classification method for revealing the intrinsic structure of multivariate data sets. It is, however, applicable only to databases consisting of numerical variables. For analyzing the intrinsic features of categorical data sets, many approaches to the quantification of nominal variables have been proposed. Most of them aim to construct combined plots of category quantifications and object scores. In this paper, we propose a new approach to the clustering of mixed databases that include not only numerical variables but also categorical variables. The clustering technique uses a simple FCM-type iterative algorithm that includes a quantification step. In the quantification step, the category scores are derived so that they suit FCM clustering, taking cluster centers and memberships into account. Numerical experiments demonstrate the characteristic features of the proposed method.
  • Keywords
    fuzzy set theory; pattern clustering; unsupervised learning; categorical data set; clustering technique; fuzzy c-means clustering; intrinsic feature; mixed databases; numerical variable; unsupervised classification method; Clustering algorithms; Data engineering; Data mining; Iterative algorithms; Least squares methods; Loss measurement; Minimization methods; Partitioning algorithms; Prototypes; Spatial databases
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Cybernetics and Intelligent Systems, 2004 IEEE Conference on
  • Print_ISBN
    0-7803-8643-4
  • Type
    conf
  • DOI
    10.1109/ICCIS.2004.1460476
  • Filename
    1460476