  • DocumentCode
    3024705
  • Title

    An Initialization Method for Clustering High-Dimensional Data

  • Author

    Chen, Luying ; Chen, Lifei ; Jiang, Qingshan ; Wang, Beizhan ; Shi, Liang

  • Author_Institution
    Software Sch., Xiamen Univ., Xiamen, China
  • fYear
    2009
  • fDate
    25-26 April 2009
  • Firstpage
    444
  • Lastpage
    447
  • Abstract
    In iterative refinement clustering algorithms, such as the various types of K-Means algorithms, the clustering results are very sensitive to the initial cluster centers. Conventional initialization methods tend to lose effectiveness due to the so-called "curse of dimensionality" when clustering high-dimensional data. In this paper, a local-density-based method is proposed to search for initial cluster centers on high-dimensional data. We define the probability density of a point as the number of its highly similar neighbors, weighted by a coefficient. Points with high-density neighborhoods and low similarity are chosen as the initial cluster centers. Experimental results on real-world datasets show the effectiveness of the proposed method.
  • Keywords
    data handling; iterative methods; pattern clustering; probability; K-means algorithm; cluster center searching; curse of dimensionality; density neighborhood; high-dimensional data clustering; initialization method; iterative refinement clustering algorithm; local density based method; probability density; weight coefficient; Application software; Clustering algorithms; Computer science; Data mining; Databases; Iterative algorithms; Iterative methods; Loss measurement; Optimization methods; Software algorithms; K-Means type clustering; cluster center initialization; data mining; high-dimensional clustering; neighborhoods based density;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Database Technology and Applications, 2009 First International Workshop on
  • Conference_Location
    Wuhan, Hubei
  • Print_ISBN
    978-0-7695-3604-0
  • Type

    conf

  • DOI
    10.1109/DBTA.2009.87
  • Filename
    5207723
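
The selection rule described in the abstract (prefer points with dense neighborhoods that are dissimilar to centers already picked) can be illustrated with a minimal sketch. The abstract does not give the paper's exact density definition or weight coefficient, so everything concrete here is an assumption: the similarity function `1 / (1 + distance)`, the `threshold` that decides which neighbors count as "highly similar", the use of the similarity itself as the weight, and the scoring rule `density * (1 - similarity to nearest chosen center)`. The function names are hypothetical and are not taken from the paper.

```python
import math


def similarity(a, b):
    """Similarity in (0, 1]; 1 means identical points (assumed form)."""
    return 1.0 / (1.0 + math.dist(a, b))


def local_density(points, threshold):
    """Weighted count of each point's highly similar neighbors."""
    densities = []
    for i, p in enumerate(points):
        total = 0.0
        for j, q in enumerate(points):
            if i == j:
                continue
            s = similarity(p, q)
            if s >= threshold:
                total += s  # assumed weight: the similarity itself
        densities.append(total)
    return densities


def initial_centers(points, k, threshold=0.5):
    """Pick k dense points that are dissimilar to those already chosen."""
    density = local_density(points, threshold)
    # Start from the densest point.
    chosen = [max(range(len(points)), key=density.__getitem__)]
    while len(chosen) < k:
        def score(i):
            # Penalize candidates that resemble an existing center.
            closest = max(similarity(points[i], points[c]) for c in chosen)
            return density[i] * (1.0 - closest)

        candidates = [i for i in range(len(points)) if i not in chosen]
        chosen.append(max(candidates, key=score))
    return [points[i] for i in chosen]


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    print(initial_centers(data, k=2, threshold=0.4))
```

The returned points would then be passed as starting centers to a K-Means implementation. The pairwise loop is quadratic in the number of points, which is acceptable for a sketch but not for large datasets.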