Title :
Scalable Constrained Spectral Clustering
Author :
Jianyuan Li ; Yingjie Xia ; Zhenyu Shan ; Yuncai Liu
Author_Institution :
Coll. of Comput. Sci., Zhejiang Univ., Hangzhou, China
Abstract :
Constrained spectral clustering (CSC) algorithms have shown great promise in significantly improving clustering accuracy by encoding side information into spectral clustering algorithms. However, existing CSC algorithms are inefficient in handling moderate and large datasets. In this paper, we aim to develop a scalable and efficient CSC algorithm by integrating sparse coding based graph construction into a framework called constrained normalized cuts. To this end, we formulate a scalable constrained normalized-cuts problem and solve it based on a closed-form mathematical analysis. We demonstrate that this problem can be reduced to a generalized eigenvalue problem that can be solved very efficiently. We also describe a principled k-way CSC algorithm for handling moderate and large datasets. Experimental results over benchmark datasets demonstrate that the proposed algorithm is greatly cost-effective, in the sense that (1) with less side information, it can obtain significant improvements in accuracy compared to the unsupervised baseline; (2) with less computational time, it can achieve high clustering accuracies close to those of the state-of-the-art.
Keywords :
eigenvalues and eigenfunctions; graph theory; pattern clustering; benchmark datasets; closed-form mathematical analysis; constrained normalized-cuts problem; constrained spectral clustering; generalized eigenvalue problem; principled k-way CSC algorithm; side information encoding; sparse coding based graph construction; Accuracy; Algorithm design and analysis; Clustering algorithms; Eigenvalues and eigenfunctions; Encoding; Sparse matrices; Vectors; Constrained spectral clustering; efficiency; scalability; sparse coding;
Journal_Title :
Knowledge and Data Engineering, IEEE Transactions on
DOI :
10.1109/TKDE.2014.2356471