DocumentCode
3024705
Title
An Initialization Method for Clustering High-Dimensional Data
Author
Chen, Luying ; Chen, Lifei ; Jiang, Qingshan ; Wang, Beizhan ; Shi, Liang
Author_Institution
Software Sch., Xiamen Univ., Xiamen, China
fYear
2009
fDate
25-26 April 2009
Firstpage
444
Lastpage
447
Abstract
In iterative refinement clustering algorithms, such as the various types of K-Means algorithms, the clustering results are very sensitive to the initial cluster centers. Conventional initialization methods tend to lose effectiveness when clustering high-dimensional data because of the so-called "curse of dimensionality". In this paper, a local-density-based method is proposed to search for initial cluster centers in high-dimensional data. We define the probability density of a point as the number of its highly similar neighbors, weighted by a weight coefficient. Points with high-density neighborhoods and low mutual similarity are chosen as the initial cluster centers. Experimental results on real-world datasets show the effectiveness of the proposed method.
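The abstract gives only the outline of the method, so the following is a minimal sketch of the general idea rather than the authors' algorithm. Everything specific in it is an assumption made for illustration: cosine similarity as the similarity measure, the similarity value itself as the weight coefficient, the thresholds `neighbor_sim` and `max_center_sim`, and the function names.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def density_based_init(points, k, neighbor_sim=0.8, max_center_sim=0.5):
    """Return indices of up to k points to use as initial cluster centers.

    Hypothetical parameters (not taken from the paper):
      neighbor_sim   -- minimum similarity for a point to count as a neighbor
      max_center_sim -- maximum similarity allowed between two chosen centers
    """
    n = len(points)
    sim = [[cosine(points[i], points[j]) for j in range(n)] for i in range(n)]

    # Assumed density: sum of similarities to all highly similar neighbors,
    # i.e. each neighbor is weighted by how similar it is to the point.
    density = [
        sum(sim[i][j] for j in range(n) if j != i and sim[i][j] >= neighbor_sim)
        for i in range(n)
    ]

    # Visit points from densest to sparsest and keep a point only if it is
    # dissimilar to every center chosen so far.
    chosen = []
    for i in sorted(range(n), key=lambda idx: -density[idx]):
        if all(sim[i][j] <= max_center_sim for j in chosen):
            chosen.append(i)
            if len(chosen) == k:
                break
    return chosen


# Two tight groups pointing in nearly orthogonal directions.
points = [
    [1.0, 0.0], [0.99, 0.1], [0.98, 0.05],
    [0.0, 1.0], [0.1, 0.99], [0.05, 0.98],
]
centers = density_based_init(points, k=2)
```

With the sample data, one index comes from each group. The function may return fewer than `k` indices when too few points satisfy the dissimilarity threshold; a caller would need its own fallback in that case.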
Keywords
data handling; iterative methods; pattern clustering; probability; K-means algorithm; cluster center searching; curse of dimensionality; density neighborhood; high-dimensional data clustering; initialization method; iterative refinement clustering algorithm; local density based method; probability density; weight coefficient; Application software; Clustering algorithms; Computer science; Data mining; Databases; Iterative algorithms; Iterative methods; Loss measurement; Optimization methods; Software algorithms; K-Means type clustering; cluster center initialization; data mining; high-dimensional clustering; neighborhoods based density;
fLanguage
English
Publisher
IEEE
Conference_Titel
Database Technology and Applications, 2009 First International Workshop on
Conference_Location
Wuhan, Hubei
Print_ISBN
978-0-7695-3604-0
Type
conf
DOI
10.1109/DBTA.2009.87
Filename
5207723