• DocumentCode
    189225
  • Title
    Data Clustering Using Topological Features
  • Author
    Pereira, Cassio M. M. ; De Mello, Rodrigo F.
  • Author_Institution
    Inst. of Math. & Comput. Sci., Sao Carlos, Brazil
  • fYear
    2014
  • fDate
    18-22 Oct. 2014
  • Firstpage
    360
  • Lastpage
    365
  • Abstract
    Clustering is one of the most widely used data mining techniques, while computational topology is a very recent field that bridges abstract mathematics and concrete computational techniques. In this paper, we explore the hypothesis that topologically similar clusters may indicate meaningful relationships. Our approach has an efficient implementation based on computing Minimum Spanning Trees to obtain topological information about each cluster. We then compute a discreteness index and a disconnectedness index, which characterize each cluster and thus allow the retrieval of equivalence classes. We show that, for a real-world high-dimensional network intrusion data set, the topologically similar clusters retrieved by our approach do indeed correspond to meaningful equivalence classes present in the data set.
  • Keywords
    data mining; pattern clustering; security of data; trees (mathematics); cluster topological information; computational techniques; computational topology; data clustering; data mining techniques; disconnectedness index; discreteness index; equivalence class retrieval; high-dimensional network intrusion data set; minimum spanning trees; topological features; Clustering algorithms; Data mining; Extraterrestrial measurements; Feature extraction; Indexes; Indium phosphide; Topology; clustering; topological features; topology
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Intelligent Systems (BRACIS), 2014 Brazilian Conference on
  • Conference_Location
    Sao Paulo
  • Type
    conf
  • DOI
    10.1109/BRACIS.2014.71
  • Filename
    6984857