• DocumentCode
    1335799
  • Title

    Spectral Embedded Clustering: A Framework for In-Sample and Out-of-Sample Spectral Clustering

  • Author

    Nie, Feiping ; Zeng, Zinan ; Tsang, Ivor W. ; Xu, Dong ; Zhang, Changshui

  • Volume
    22
  • Issue
    11
  • fYear
    2011
  • Firstpage
    1796
  • Lastpage
    1808
  • Abstract
    Spectral clustering (SC) methods have been successfully applied to many real-world applications. The success of these SC methods is largely based on the manifold assumption, namely, that two nearby data points in the high-density region of a low-dimensional data manifold have the same cluster label. However, such an assumption might not always hold on high-dimensional data. When the data do not exhibit a clear low-dimensional manifold structure (e.g., high-dimensional and sparse data), the clustering performance of SC will be degraded and become even worse than K-means clustering. In this paper, motivated by the observation that the true cluster assignment matrix for high-dimensional data can always be embedded in a linear space spanned by the data, we propose the spectral embedded clustering (SEC) framework, in which a linearity regularization is explicitly added into the objective function of SC methods. More importantly, the proposed SEC framework can naturally deal with out-of-sample data. We also present a new Laplacian matrix constructed from a local regression of each pattern and incorporate it into our SEC framework to capture both local and global discriminative information for clustering. Comprehensive experiments on eight real-world high-dimensional datasets demonstrate the effectiveness and advantages of our SEC framework over existing SC methods and K-means-based clustering methods. Our SEC framework significantly outperforms SC using the Nyström algorithm on unseen data.
  • Keywords
    embedded systems; matrix algebra; pattern clustering; regression analysis; set theory; K-means clustering performance; Laplacian matrix; Nystrom algorithm; SC method; SEC framework; cluster assignment matrix; cluster label; high density region; in-sample spectral embedded clustering; linear space; linearity regularization; local regression; low dimensional manifold structure; out-of-sample spectral embedded clustering; real world high dimensional datasets; Clustering algorithms; Clustering methods; Eigenvalues and eigenfunctions; Laplace equations; Manifolds; Matrix decomposition; Optimization; Linearity regularization; out-of-sample clustering; spectral clustering; spectral embedded clustering; Algorithms; Artificial Intelligence; Cluster Analysis; Data Interpretation, Statistical; Linear Models; Pattern Recognition, Automated; Regression Analysis
  • fLanguage
    English
  • Journal_Title
    Neural Networks, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1045-9227
  • Type

    jour

  • DOI
    10.1109/TNN.2011.2162000
  • Filename
    6030950