DocumentCode
189225
Title
Data Clustering Using Topological Features
Author
Pereira, Cassio M. M. ; De Mello, Rodrigo F.
Author_Institution
Inst. of Math. & Comput. Sci., Sao Carlos, Brazil
fYear
2014
fDate
18-22 Oct. 2014
Firstpage
360
Lastpage
365
Abstract
Clustering is one of the most widely used data mining techniques, while computational topology is a very recent field bridging abstract mathematics with concrete computational techniques. In this paper, we explore the hypothesis that topologically similar clusters may indicate meaningful relationships. Our approach has an efficient implementation based on computing Minimum Spanning Trees to obtain topological information about each cluster. We then compute a discreteness index and a disconnectedness index, which characterize each cluster and thus allow the retrieval of equivalence classes. We show that, for a real-world high-dimensional network intrusion data set, the topologically similar clusters retrieved by our approach do indeed correspond to meaningful equivalence classes present in the data set.
Keywords
data mining; pattern clustering; security of data; trees (mathematics); cluster topological information; computational techniques; computational topology; data clustering; data mining techniques; disconnectedness index; discreteness index; equivalence class retrieval; high-dimensional network intrusion data set; minimum spanning trees; topological features; Clustering algorithms; Data mining; Extraterrestrial measurements; Feature extraction; Indexes; Indium phosphide; Topology; clustering; topological features; topology
fLanguage
English
Publisher
IEEE
Conference_Titel
Intelligent Systems (BRACIS), 2014 Brazilian Conference on
Conference_Location
Sao Paulo
Type
conf
DOI
10.1109/BRACIS.2014.71
Filename
6984857