DocumentCode :
2400570
Title :
Efficient ensemble algorithm for mixed numeric and categorical data
Author :
Reddy, M. V Jagannatha ; Kavitha, B.
Author_Institution :
Dept. of CSE, Madanapalle Inst. of Technol. & Sci., Chittoor, India
fYear :
2010
fDate :
28-29 Dec. 2010
Firstpage :
1
Lastpage :
4
Abstract :
Most previous clustering algorithms focus on numerical data, whose inherent geometric properties can be exploited naturally to define distance functions between data points. However, much of the data in databases is categorical, with attribute values that cannot be naturally ordered the way numerical values can. Because of the differences in the characteristics of these two kinds of data, attempts to develop criterion functions for mixed data have not been very successful. In this research, we propose a novel divide-and-conquer technique to solve this problem. First, the original mixed dataset is divided into two sub-datasets: the pure categorical dataset and the pure numeric dataset. Next, existing well-established clustering algorithms designed for each type of dataset are employed to produce the corresponding clusters. Last, the clustering results on the categorical and numeric datasets are combined into a categorical dataset, on which the categorical data clustering algorithm is employed to get the final output. Our main contribution in this research is an algorithm framework for the mixed-attribute clustering problem, into which existing clustering algorithms can be easily integrated.
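The three steps described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not name the specific algorithms used, so k-means (numeric) and k-modes (categorical) are stand-ins chosen here, and the names `kmeans`, `kmodes`, and `cluster_mixed`, the deterministic "first k distinct rows" initialization, and the toy dataset are all hypothetical.

```python
from collections import Counter


def kmeans(points, k, iters=20):
    """Plain k-means (assumed stand-in for the numeric clustering step)."""
    # Simplified, deterministic initialization: the first k distinct points.
    centroids = []
    for p in points:
        if list(p) not in centroids:
            centroids.append(list(p))
        if len(centroids) == k:
            break
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign each point to the centroid with the smallest squared distance.
        labels = [
            min(range(len(centroids)),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        # Move each centroid to the mean of its members.
        for c in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def kmodes(rows, k, iters=20):
    """Plain k-modes (assumed stand-in for the categorical clustering step)."""
    # Simplified, deterministic initialization: the first k distinct rows.
    modes = []
    for r in rows:
        if tuple(r) not in modes:
            modes.append(tuple(r))
        if len(modes) == k:
            break
    labels = [0] * len(rows)
    for _ in range(iters):
        # Assign each row to the mode with the fewest mismatching attributes.
        labels = [
            min(range(len(modes)),
                key=lambda c: sum(a != b for a, b in zip(r, modes[c])))
            for r in rows
        ]
        # Replace each mode with the most frequent value per attribute.
        for c in range(len(modes)):
            members = [r for r, lab in zip(rows, labels) if lab == c]
            if members:
                modes[c] = tuple(Counter(col).most_common(1)[0][0]
                                 for col in zip(*members))
    return labels


def cluster_mixed(rows, numeric_idx, categorical_idx, k):
    """Divide, cluster each part, then cluster the combined label pairs."""
    # Step 1: split the mixed dataset into numeric and categorical sub-datasets.
    numeric = [[float(r[i]) for i in numeric_idx] for r in rows]
    categorical = [[r[i] for i in categorical_idx] for r in rows]
    # Step 2: cluster each sub-dataset with an algorithm suited to its type.
    num_labels = kmeans(numeric, k)
    cat_labels = kmodes(categorical, k)
    # Step 3: treat the two label columns as a new categorical dataset
    # and cluster it with the categorical algorithm to get the final output.
    combined = [(f"n{a}", f"c{b}") for a, b in zip(num_labels, cat_labels)]
    return kmodes(combined, k)


# Toy mixed dataset (hypothetical): two numeric and two categorical attributes.
rows = [
    (1.0, 1.1, "red", "small"),
    (8.0, 8.2, "blue", "large"),
    (0.9, 1.0, "red", "small"),
    (7.9, 8.1, "blue", "large"),
    (1.2, 0.8, "red", "medium"),
    (8.3, 7.7, "blue", "large"),
]
labels = cluster_mixed(rows, numeric_idx=[0, 1], categorical_idx=[2, 3], k=2)
print(labels)  # [0, 1, 0, 1, 0, 1]
```

Because the final step only sees cluster labels, either base algorithm could be swapped for another without changing `cluster_mixed`, which is the integration property the abstract emphasizes.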
Keywords :
divide and conquer methods; pattern clustering; categorical data; categorical data clustering algorithm; clustering algorithms; divide-and-conquer technique; ensemble algorithm; mixed numeric data; pure categorical dataset; pure numeric dataset; Algorithm design and analysis; Clustering algorithms; Complexity theory; Data mining; Indexes; Machine learning algorithms; Partitioning algorithms; categorical dataset; divide-and-conquer; numerical dataset
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Computational Intelligence and Computing Research (ICCIC), 2010 IEEE International Conference on
Conference_Location :
Coimbatore
Print_ISBN :
978-1-4244-5965-0
Electronic_ISBN :
978-1-4244-5967-4
Type :
conf
DOI :
10.1109/ICCIC.2010.5705738
Filename :
5705738