Clustering by attraction and distraction

Author

Chongstitvatana, Jaruloj ; Thubtimdang, Wanwara

Author_Institution

Dept. of Math., Chulalongkorn Univ., Bangkok, Thailand

fYear

2011

fDate

11-13 May 2011

Firstpage

368

Lastpage

372

Abstract

Clustering is a data analysis task that aims to group similar objects together while separating them from dissimilar objects. Centroid-based clustering methods create hyperspherical clusters, and thus cannot cluster correctly when similar objects do not form a hypersphere. This work proposes an agglomerative clustering method based on the concepts of attraction and distraction. Attraction is measured by the number of similar object pairs in two clusters and the sizes of the two clusters. Distraction is the possibility that other cluster pairs are candidates for merging. The proposed algorithm is evaluated against the K-means algorithm, and it is found to give higher accuracy than K-means on the iris and Haberman survival datasets, lower accuracy on the breast cancer and SPECT heart test datasets, and comparable accuracy on the wine dataset.

Keywords

data analysis; pattern clustering; Haberman survival datasets; K-means algorithm; SPECT heart test datasets; agglomerative clustering method; attraction; breast cancer datasets; centroid-based clustering methods; distraction; iris survival datasets; wine dataset; agglomerative clustering; cluster analysis; clustering; unsupervised classification

fLanguage

English

Publisher

IEEE

Conference_Title

Computer Science and Software Engineering (JCSSE), 2011 Eighth International Joint Conference on

Conference_Location

Nakhon Pathom

Print_ISBN

978-1-4577-0686-8

Type

conf

DOI

10.1109/JCSSE.2011.5930149

Filename

5930149