Title :
New distance measure based on the domain for categorical data
Author :
Aranganayagi, S. ; Thangavel, K. ; Sujatha, S.
Author_Institution :
J.K.K. Nataraja Coll. of Arts & Sci., Komarapalayam, India
Abstract :
Clustering, the process of grouping homogeneous objects, is an important data mining task. Few algorithms exist to cluster categorical data. K-modes is a scalable and efficient algorithm for clustering categorical data. In this paper we propose a new distance measure for K-modes based on the cardinality of the domain of each attribute. The proposed method is evaluated on data sets obtained from the UCI data repository. Results show that the proposed measure generates better clusters than the standard K-modes algorithm.
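The abstract does not state the proposed formula, so the following is only a minimal sketch of the setting it describes. It shows the simple matching dissimilarity commonly used by K-modes, plus a hook for per-attribute weights. The function names and the `1 / cardinality` weighting are assumptions made for illustration and are not taken from the paper.

```python
from collections import Counter


def simple_matching(x, y, weights=None):
    """Simple matching dissimilarity commonly used by K-modes.

    Each attribute on which x and y differ contributes 1, or
    weights[j] when per-attribute weights are supplied.
    """
    if weights is None:
        weights = [1.0] * len(x)
    return sum(w for a, b, w in zip(x, y, weights) if a != b)


def domain_cardinalities(records):
    """Number of distinct values observed for each attribute."""
    return [len(set(column)) for column in zip(*records)]


def cardinality_weights(records):
    """ASSUMPTION: hypothetical weighting, not the paper's measure.

    A mismatch on an attribute with a small domain is weighted more
    heavily than one on an attribute with a large domain.
    """
    return [1.0 / c for c in domain_cardinalities(records)]


def mode(records):
    """Cluster representative: the most frequent value per attribute."""
    return tuple(Counter(column).most_common(1)[0][0] for column in zip(*records))


# Toy records with three categorical attributes (illustrative data only).
records = [
    ("red", "small", "yes"),
    ("red", "large", "no"),
    ("blue", "medium", "yes"),
    ("red", "small", "yes"),
]

cards = domain_cardinalities(records)
weights = cardinality_weights(records)
plain = simple_matching(records[0], records[1])
weighted = simple_matching(records[0], records[1], weights)
centre = mode(records)
```

Passing `weights=None` reproduces the unweighted mismatch count, so the same function can be used to compare a baseline against any domain-based weighting scheme.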
Keywords :
data mining; pattern clustering; K-modes algorithm; UCI data repository; categorical data; cluster categorical data; data clustering process; data mining process; distance measure; Clustering algorithms; Clustering methods; Data engineering; Data mining; Databases; Educational institutions; Fasteners; Frequency measurement; Histograms; Weight measurement;
Conference_Title :
Advanced Computing, 2009. ICAC 2009. First International Conference on
Conference_Location :
Chennai
Print_ISBN :
978-1-4244-4786-2
Electronic_ISBN :
978-1-4244-4787-9
DOI :
10.1109/ICADVC.2009.5378267