• DocumentCode
    2717516
  • Title
    Clustering massive categorical data with class association rules
  • Author
    Berrado, Abdelaziz ; Runger, George
  • Author_Institution
    Al Akhawayn Univ.
  • fYear
    2008
  • fDate
    16-18 Dec. 2008
  • Firstpage
    223
  • Lastpage
    227
  • Abstract
    Clustering algorithms partition data sets into groups of objects such that the pairwise similarity between objects within the same cluster is higher than that between objects assigned to different clusters. Defining a similarity measure becomes challenging in the presence of categorical data and affects the quality and meaningfulness of the clusters formed. Furthermore, the curse of dimensionality diminishes the robustness of such measures. This paper introduces SCAR (supervised clustering with association rules), a nontraditional algorithm for clustering massive high-dimensional categorical data. SCAR is robust to the curse of dimensionality; it relies on association rules as an intuitive way to evaluate the similarity between objects and group them.
  • Keywords
    data mining; pattern clustering; SCAR; class association rules; clustering algorithms; clustering massive categorical data; supervised clustering; Association rules; Clustering algorithms; Clustering methods; Entropy; Euclidean distance; Mutual information; Partitioning algorithms; Robustness; Supervised learning; Topology;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Innovations in Information Technology, 2008. IIT 2008. International Conference on
  • Conference_Location
    Al Ain
  • Print_ISBN
    978-1-4244-3396-4
  • Electronic_ISBN
    978-1-4244-3397-1
  • Type
    conf
  • DOI
    10.1109/INNOVATIONS.2008.4781693
  • Filename
    4781693