Title :
Clustering by attraction and distraction
Author :
Chongstitvatana, Jaruloj ; Thubtimdang, Wanwara
Author_Institution :
Dept. of Math., Chulalongkorn Univ., Bangkok, Thailand
Abstract :
Clustering is a data analysis task that aims to group similar objects together while separating them from dissimilar objects. Centroid-based clustering methods create clusters in the shape of a hyper-sphere, and thus cannot form clusters correctly when similar objects do not lie in a hyper-sphere. This work proposes an agglomerative clustering method based on the concepts of attraction and distraction. Attraction is measured by the number of similar object pairs in two clusters and the sizes of the two clusters. Distraction is the possibility that other cluster pairs could be merged instead. The proposed algorithm is evaluated against the K-means algorithm: it gives higher accuracy than K-means on the iris and Haberman survival datasets, lower accuracy on the breast cancer and SPECT heart test datasets, and comparable accuracy on the wine dataset.
Keywords :
data analysis; pattern clustering; Haberman survival datasets; K-means algorithm; SPECT heart test datasets; agglomerative clustering method; attraction; breast cancer datasets; centroid-based clustering methods; distraction; iris dataset; wine dataset; agglomerative clustering; cluster analysis; clustering; unsupervised classification
Conference_Title :
Computer Science and Software Engineering (JCSSE), 2011 Eighth International Joint Conference on
Conference_Location :
Nakhon Pathom
Print_ISBN :
978-1-4577-0686-8
DOI :
10.1109/JCSSE.2011.5930149