• DocumentCode
    2454776
  • Title
    Autonomous Clustering Characterization for Categorical Data
  • Author
    Grozavu, Nistor ; Labiod, Lazhar ; Bennani, Younès
  • Author_Institution
    LIPN, Univ. Paris 13, Villetaneuse, France
  • fYear
    2010
  • fDate
    12-14 Dec. 2010
  • Firstpage
    607
  • Lastpage
    613
  • Abstract
    This paper addresses the problem of cluster characterization: autonomously selecting, for each cluster of a categorical dataset, a subset of the most relevant features. The proposed autonomous model is based on Relational Topological Clustering (RTC) combined with a statistical test that detects the most important variables automatically, without setting any parameters. The RTC approach is used to build a prototype matrix of continuous variables, in which each prototype vector represents correlated categorical data. The statistical ScreeTest is then used to detect the relevant and correlated features (or modalities) for each prototype. The proposed method requires only simple computational techniques, and the RTC topology technique is based on the principle of the self-organizing map (SOM) model. The method performs dimensionality reduction, visualization, and cluster characterization simultaneously. Empirical results on real datasets from the UCI repository are given and discussed.
  • Keywords
    category theory; data reduction; data visualisation; feature extraction; pattern clustering; self-organising feature maps; statistical testing; unsupervised learning; autonomous clustering characterization; categorical data; cluster characterization; dimensionality reduction; feature detection; prototype matrix; prototype vector; relational topological clustering; self-organizing map; statistical ScreeTest; statistical test; unsupervised learning; visualization; Acceleration; Algorithm design and analysis; Clustering algorithms; Eigenvalues and eigenfunctions; Machine learning; Neurons; Prototypes; autonomous learning; feature selection; relational clustering;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Machine Learning and Applications (ICMLA), 2010 Ninth International Conference on
  • Conference_Location
    Washington, DC
  • Print_ISBN
    978-1-4244-9211-4
  • Type
    conf
  • DOI
    10.1109/ICMLA.2010.94
  • Filename
    5708893