  • DocumentCode
    3116413
  • Title
    PCA-guided k-Means clustering with incomplete data
  • Author
    Honda, Katsuhiro ; Nonoguchi, Ryoichi ; Notsu, Akira ; Ichihashi, Hidetomo
  • Author_Institution
    Dept. of Comput. Sci. & Intell. Syst., Osaka Prefecture Univ., Sakai, Japan
  • fYear
    2011
  • fDate
    27-30 June 2011
  • Firstpage
    1710
  • Lastpage
    1714
  • Abstract
    This paper considers k-Means clustering of incomplete data sets including missing values. Although the main purpose of k-Means clustering is to partition samples into several homogeneous clusters by minimizing within-cluster errors, it has been shown that a relaxed solution of k-Means can be recovered in a PCA-guided manner. In this paper, the PCA-guided k-Means procedure is extended to a situation in which some observations are missing. Principal component scores, which can be identified with a rotated solution of cluster indicators of k-Means clustering, are estimated in an iterative process without imputation. Besides solving the eigenvalue problem of covariance matrices, k-Means-like partitions are derived through lower rank approximation of the data matrix ignoring missing elements. Several experimental results demonstrate that the PCA-guided process is more robust to initialization problems even though it is based on iterative optimization, just as the k-Means procedure is.
  • Keywords
    approximation theory; covariance matrices; data handling; iterative methods; optimisation; pattern clustering; principal component analysis; set theory; PCA-guided k-means clustering; cluster indicator; data matrix ignoring missing element; eigenvalue problem; homogeneous cluster; incomplete data sets; iterative optimization; iterative process; lower rank approximation; principal component scores; Clustering algorithms; Helium; Noise; Optimization; Principal component analysis; Prototypes; Robustness; k-means clustering; missing value
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Fuzzy Systems (FUZZ), 2011 IEEE International Conference on
  • Conference_Location
    Taipei
  • ISSN
    1098-7584
  • Print_ISBN
    978-1-4244-7315-1
  • Electronic_ISBN
    1098-7584
  • Type
    conf
  • DOI
    10.1109/FUZZY.2011.6007312
  • Filename
    6007312
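
The idea summarized in the abstract, a lower-rank approximation of the data matrix that ignores missing elements and whose scores play the role of relaxed cluster indicators, can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `low_rank_scores`, the alternating least squares updates, the small ridge term, centering by observed column means, and the synthetic two-cluster data are all assumptions made for illustration. For two clusters, the sign of the first score is used as the partition, following the known PCA relaxation of k-Means.

```python
import numpy as np

def low_rank_scores(X, rank=1, n_iter=100, seed=0):
    """Fit X ~ mean + F @ A.T by alternating least squares,
    using only the observed (non-NaN) entries. Hypothetical helper."""
    rng = np.random.default_rng(seed)
    observed = ~np.isnan(X)
    mean = np.nanmean(X, axis=0)             # column means over observed values (assumption)
    Xc = np.where(observed, X - mean, 0.0)   # placeholders only; masked out below
    n, m = X.shape
    F = rng.standard_normal((n, rank))       # scores (one row per sample)
    A = rng.standard_normal((m, rank))       # loadings (one row per variable)
    ridge = 1e-8 * np.eye(rank)              # keeps the small systems solvable
    for _ in range(n_iter):
        # Update each loading vector from the samples observed in that column.
        for j in range(m):
            rows = observed[:, j]
            Fj = F[rows]
            A[j] = np.linalg.solve(Fj.T @ Fj + ridge, Fj.T @ Xc[rows, j])
        # Update each score vector from the variables observed in that row.
        for i in range(n):
            cols = observed[i]
            Ai = A[cols]
            F[i] = np.linalg.solve(Ai.T @ Ai + ridge, Ai.T @ Xc[i, cols])
    return F, A, mean

# Synthetic example: two well-separated clusters with about 20% missing entries.
rng = np.random.default_rng(42)
true_labels = np.repeat([0, 1], 30)
centers = np.array([[-3.0] * 4, [3.0] * 4])
X_full = centers[true_labels] + rng.standard_normal((60, 4))
missing = rng.random((60, 4)) < 0.2
missing[np.arange(60), rng.integers(0, 4, size=60)] = False  # keep >= 1 value per row
X = np.where(missing, np.nan, X_full)

F, A, mean = low_rank_scores(X, rank=1)
labels = (F[:, 0] > 0).astype(int)           # two clusters: split on the score sign
agreement = max(np.mean(labels == true_labels), np.mean(labels != true_labels))
print(f"agreement with the generating clusters: {agreement:.2f}")
```

No value is imputed at any step: each least squares update simply skips the missing entries. The sign of the scores is arbitrary, so agreement with the generating clusters is measured up to a swap of the two labels.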