• DocumentCode
    2006490
  • Title
    Farthest Centroids Divisive Clustering
  • Author
    Fang, Haw-ren; Saad, Y.
  • Author_Institution
    Math & Comp. Sci. Div., Argonne Nat. Lab., Argonne, IL, USA
  • fYear
    2008
  • fDate
    11-13 Dec. 2008
  • Firstpage
    232
  • Lastpage
    238
  • Abstract
    A method is presented to partition a given set of data entries embedded in Euclidean space by recursively bisecting clusters into smaller ones. The initial set is subdivided into two subsets whose centroids are farthest from each other, and the process is repeated recursively on each subset. An approximate algorithm is proposed to solve the original integer programming problem, which is NP-hard. Experimental evidence shows that the clustering method often outperforms a standard spectral clustering method, albeit at a slightly higher computational cost. The paper also discusses improvements to the standard K-means algorithm. Specifically, the clustering quality resulting from the K-means technique can be significantly enhanced by using the proposed algorithm for its initialization.
  • Keywords
    approximation theory; computational complexity; integer programming; pattern clustering; K-means algorithm; NP-hard; approximate algorithm; centroid divisive clustering; integer programming problem; Clustering algorithms; Clustering methods; Computational efficiency; Laboratories; Linear programming; Machine learning; Partitioning algorithms; Proteins; Sequences; Traveling salesman problems; Lanczos method; farthest centroids; spectral bisection; unsupervised clustering
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Machine Learning and Applications, 2008. ICMLA '08. Seventh International Conference on
  • Conference_Location
    San Diego, CA
  • Print_ISBN
    978-0-7695-3495-4
  • Type
    conf
  • DOI
    10.1109/ICMLA.2008.141
  • Filename
    4724980
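
The divisive scheme summarized in the abstract (bisect a set, then keep bisecting the pieces) can be illustrated with a minimal sketch. This is not the paper's approximate algorithm, which the record does not specify. As a stand-in, the split below uses a simple heuristic: take the point farthest from the centroid as one seed, take the point farthest from that seed as the other, and assign each point to its nearer seed. The function names (centroid, bisect, divisive_clustering) and the stopping rule (split the largest cluster until k clusters exist, rather than recursing on every subset) are assumptions made for illustration.

```python
import math


def centroid(points):
    """Component-wise mean of a non-empty list of equal-length tuples."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def bisect(points):
    """Split points in two with a far-apart-seeds heuristic.

    Assumption: this is a stand-in, NOT the paper's approximate algorithm.
    Seed A is the point farthest from the centroid, seed B is the point
    farthest from seed A, and each point joins its nearer seed.
    """
    c = centroid(points)
    seed_a = max(points, key=lambda p: math.dist(p, c))
    seed_b = max(points, key=lambda p: math.dist(p, seed_a))
    left, right = [], []
    for p in points:
        if math.dist(p, seed_a) <= math.dist(p, seed_b):
            left.append(p)
        else:
            right.append(p)
    return left, right


def divisive_clustering(points, k):
    """Bisect the largest cluster repeatedly until k clusters exist."""
    clusters = [list(points)]
    while len(clusters) < k:
        clusters.sort(key=len)
        largest = clusters.pop()
        if len(largest) < 2:
            clusters.append(largest)  # nothing left to split
            break
        left, right = bisect(largest)
        if not left or not right:  # all points coincide; cannot split
            clusters.append(largest)
            break
        clusters.extend([left, right])
    return clusters


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    for cluster in divisive_clustering(data, 2):
        print(centroid(cluster), cluster)
```

In the spirit of the abstract's last sentence, the centroids of the returned clusters could serve as starting centers for K-means. Whether this heuristic yields the quality gains reported in the paper is not something the sketch establishes.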