Title :
An Initialization Method for Clustering High-Dimensional Data
Author :
Chen, Luying ; Chen, Lifei ; Jiang, Qingshan ; Wang, Beizhan ; Shi, Liang
Author_Institution :
Software Sch., Xiamen Univ., Xiamen, China
Abstract :
In iterative refinement clustering algorithms, such as the various types of K-Means algorithms, the clustering results are very sensitive to the initial cluster centers. Conventional initialization methods tend to lose effectiveness on high-dimensional data because of the so-called "curse of dimensionality". In this paper, a local-density-based method is proposed to search for initial cluster centers in high-dimensional data. We define the probability density of a point as the number of its highly similar neighbors, scaled by a weight coefficient. Points that have high-density neighborhoods and low similarity to one another are chosen as the initial cluster centers. Experimental results on real-world datasets show the effectiveness of the proposed method.
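The idea summarized in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract does not give the similarity measure, the weight coefficient, or the selection rule, so the sketch assumes cosine similarity, uses the similarity value itself as the weight, and greedily picks points that score high on density times dissimilarity to the centers already chosen. The names `density_init`, `cosine_similarity`, and `threshold` are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a > 0 and norm_b > 0 else 0.0


def density_init(points, k, threshold=0.9):
    """Return indices of k initial centers (illustrative sketch only).

    Assumptions not taken from the paper: cosine similarity, a fixed
    similarity threshold, and the similarity value used as the weight.
    """
    n = len(points)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and the number of points")

    # Pairwise similarity matrix (O(n^2); acceptable for a small illustration).
    sim = [[cosine_similarity(points[i], points[j]) for j in range(n)]
           for i in range(n)]

    # Local density: similarity-weighted count of highly similar neighbors.
    density = [
        sum(sim[i][j] for j in range(n) if j != i and sim[i][j] >= threshold)
        for i in range(n)
    ]

    # First center: the densest point.
    chosen = [max(range(n), key=lambda i: density[i])]

    # Remaining centers: dense points that are dissimilar to every chosen center.
    while len(chosen) < k:
        def score(i):
            closest = max(sim[i][c] for c in chosen)
            return density[i] * (1.0 - closest)

        candidates = [i for i in range(n) if i not in chosen]
        chosen.append(max(candidates, key=score))

    return chosen


# Toy data: three groups of three points, each group along a different axis.
points = [
    (1.0, 0.1, 0.0), (0.9, 0.0, 0.1), (1.0, 0.0, 0.0),
    (0.0, 1.0, 0.1), (0.1, 0.9, 0.0), (0.0, 1.0, 0.0),
    (0.0, 0.1, 1.0), (0.1, 0.0, 0.9), (0.0, 0.0, 1.0),
]
centers = density_init(points, k=3)
print(centers)
```

On this toy data the sketch selects one index from each group. The returned indices could then seed any K-Means implementation in place of random initialization.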
Keywords :
data handling; iterative methods; pattern clustering; probability; K-means algorithm; cluster center searching; curse of dimensionality; density neighborhood; high-dimensional data clustering; initialization method; iterative refinement clustering algorithm; local density based method; probability density; weight coefficient; Application software; Clustering algorithms; Computer science; Data mining; Databases; Iterative algorithms; Iterative methods; Loss measurement; Optimization methods; Software algorithms; K-Means type clustering; cluster center initialization; data mining; high-dimensional clustering; neighborhoods based density;
Conference_Title :
Database Technology and Applications, 2009 First International Workshop on
Conference_Location :
Wuhan, Hubei
Print_ISBN :
978-0-7695-3604-0
DOI :
10.1109/DBTA.2009.87