Author_Institution :
Machine Intell. Lab., Sichuan Univ., Chengdu, China
Abstract :
In this paper, we address two problems in the Sparse Subspace Clustering (SSC) algorithm: the scalability issue and the out-of-sample problem. SSC constructs a sparse similarity graph for spectral clustering from l1-minimization-based coefficients and has achieved state-of-the-art results in image clustering and motion segmentation. However, the time complexity of SSC is proportional to the cube of the problem size, which makes it inefficient in large-scale settings. Moreover, SSC cannot handle out-of-sample data, i.e., data that were not used to construct the similarity graph. For each new datum, SSC must recompute the cluster membership of the whole data set, which makes it uncompetitive for fast online clustering. To address these problems, this paper proposes an out-of-sample extension of SSC, named Scalable Sparse Subspace Clustering (SSSC), which makes SSC feasible for clustering large-scale data sets. SSSC adopts a "sampling, clustering, coding, and classifying" strategy. Extensive experimental results on several popular data sets demonstrate the effectiveness and efficiency of our method compared with state-of-the-art algorithms.
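The last two steps of the "sampling, clustering, coding, and classifying" strategy can be illustrated with a minimal sketch. The abstract does not say how the coding and classifying steps are carried out, so the choices below are assumptions made for illustration only: each out-of-sample point is coded over the in-sample points by ridge-regularized least squares, and it is then assigned to the cluster whose in-sample points reconstruct it with the smallest residual. The function name `classify_out_of_sample`, the penalty `lam`, and the toy data are all hypothetical, and the in-sample labels are taken as given (in SSSC they would come from running SSC on the sampled subset).

```python
import numpy as np

def classify_out_of_sample(X_in, labels_in, X_out, lam=1e-3):
    """Assign each column of X_out to a cluster already found on X_in.

    X_in:      (d, p) in-sample data, one point per column.
    labels_in: (p,) cluster labels of the in-sample points.
    X_out:     (d, m) out-of-sample data, one point per column.
    lam:       ridge penalty (an assumed value, not taken from the paper).
    """
    labels_in = np.asarray(labels_in)
    clusters = np.unique(labels_in)
    p = X_in.shape[1]

    # Coding (assumed form): ridge-regularized least squares over all
    # in-sample points. Column j of C is the code of out-of-sample point j.
    G = X_in.T @ X_in + lam * np.eye(p)
    C = np.linalg.solve(G, X_in.T @ X_out)

    # Classifying (assumed form): keep only the coefficients belonging to
    # one cluster at a time and measure the reconstruction residual.
    residuals = np.empty((clusters.size, X_out.shape[1]))
    for k, c in enumerate(clusters):
        mask = labels_in == c
        recon = X_in[:, mask] @ C[mask, :]
        residuals[k] = np.linalg.norm(X_out - recon, axis=0)

    # Each point goes to the cluster with the smallest residual.
    return clusters[np.argmin(residuals, axis=0)]

# Toy data: two orthogonal 2-D subspaces of R^4 (hypothetical example).
X_in = np.array([[1, 0, 1, 0, 0, 0],
                 [0, 1, 1, 0, 0, 0],
                 [0, 0, 0, 1, 0, 1],
                 [0, 0, 0, 0, 1, 1]], dtype=float)
labels_in = [0, 0, 0, 1, 1, 1]   # stand-in for labels produced by SSC
X_out = np.array([[2.0, 0.0],
                  [-1.0, 0.0],
                  [0.0, 0.5],
                  [0.0, 3.0]])
pred = classify_out_of_sample(X_in, labels_in, X_out)
```

In this sketch the expensive clustering is confined to the p sampled points, while each remaining point costs only one linear solve against a p-by-p system that can be factored once and reused. That is the intuition behind avoiding a full re-clustering for every new datum, under the assumptions stated above.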
Keywords :
computational complexity; graph theory; image classification; image coding; image motion analysis; image sampling; image segmentation; pattern clustering; SSC; cluster membership recalculation; image clustering; l1-minimization; motion segmentation; out-of-sample problem; sampling-clustering-coding-and-classifying strategy; scalability issue; scalable sparse subspace clustering algorithm; sparse similarity graph; spectral clustering; time complexity; Accuracy; Clustering algorithms; Encoding; Kernel; Laplace equations; Scalability; Sparse matrices; Large scale dataset; Sparse similarity graph; spectral clustering; subspace clustering;