DocumentCode
389624
Title
MSGKA: an efficient clustering algorithm for large databases
Author
Tsai, Cheng-Fa ; Chen, Zhi-Cheng ; Tsai, Chun-Wei
Author_Institution
Dept. of Manage. Inf. Syst., Nat. Pingtung Univ. of Sci. & Technol., Taiwan
Volume
5
fYear
2002
fDate
6-9 Oct. 2002
Abstract
This investigation presents an efficient clustering algorithm for large databases. We present a novel multiple-searching genetic algorithm (MSGA) that finds a globally optimal partition of a given data set into a specified number of clusters. We hybridize MSGA with a multiple-searching approach utilized in clustering, namely the K-means algorithm; hence the name multiple-searching genetic K-means algorithm (MSGKA). Our simulation results reveal that the proposed clustering approach performs better than both the Fast SOM combined K-means approach (FSOM+K-means) and the Genetic K-Means Algorithm (GKA). Moreover, in all the cases we studied, our approach produces much smaller errors than both FSOM+K-means and GKA.
Keywords
data mining; database theory; genetic algorithms; pattern clustering; search problems; very large databases; Fast SOM combined K-means approach; Genetic K-Means Algorithm; K-means algorithm; MSGKA; clustering algorithm; data mining; errors; large databases; multiple-searching genetic K-means algorithm; multiple-searching genetic algorithm; simulation; Biological cells; Clustering algorithms; Costs; Data mining; Databases; Genetic algorithms; Machine learning; Partitioning algorithms; Pattern recognition; Statistics;
fLanguage
English
Publisher
IEEE
Conference_Titel
Systems, Man and Cybernetics, 2002 IEEE International Conference on
ISSN
1062-922X
Print_ISBN
0-7803-7437-1
Type
conf
DOI
10.1109/ICSMC.2002.1176400
Filename
1176400