DocumentCode
1631997
Title
A new data clustering approach for data mining in large databases
Author
Tsai, Cheng-Fa ; Wu, Han-Chang ; Tsai, Chun-Wei
Author_Institution
Dept. of Manage. Inf. Syst., Nat. Pingtung Univ. of Sci. & Technol., Taiwan
fYear
2002
fDate
2002
Firstpage
278
Lastpage
283
Abstract
Clustering is the unsupervised classification of patterns (data items, feature vectors, or observations) into groups (clusters). In data mining, clustering is useful for discovering distribution patterns in the underlying data. Clustering algorithms usually employ a distance metric-based similarity measure to partition the database so that data points in the same partition are more similar to one another than to points in different partitions. In this paper, we present a new data clustering method for data mining in large databases. Our simulation results show that the proposed clustering method performs better than a fast self-organizing map (FSOM) combined with the k-means approach (FSOM+k-means) and the genetic k-means algorithm (GKA). In addition, in all the cases we studied, our method produces much smaller errors than both the FSOM+k-means approach and GKA.
Keywords
data mining; genetic algorithms; pattern clustering; self-organising feature maps; very large databases; FSOM+k-means approach; ant system; data clustering method; data distribution pattern discovery; data item; data mining; database partitioning; distance metric-based similarity measure; errors; fast self-organizing map; feature vectors; genetic k-means algorithm; large databases; observations; similar data points; simulation; unsupervised pattern classification; Clustering algorithms; Clustering methods; Data mining; Extraterrestrial measurements; Feedback; Iterative algorithms; Partitioning algorithms; Prototypes; Shape measurement; Spatial databases;
fLanguage
English
Publisher
IEEE
Conference_Titel
Parallel Architectures, Algorithms and Networks, 2002. I-SPAN '02. Proceedings. International Symposium on
Conference_Location
Makati City, Metro Manila
ISSN
1087-4089
Print_ISBN
0-7695-1579-7
Type
conf
DOI
10.1109/ISPAN.2002.1004300
Filename
1004300