Title :
Spectral clustering with imbalanced data
Author :
Qian, Jing; Saligrama, Venkatesh
Author_Institution :
Boston Univ., Boston, MA, USA
Abstract :
Spectral clustering is sensitive to how graphs are constructed from data. In particular, if the data has proximal and imbalanced clusters, spectral clustering can perform poorly on well-known graphs such as k-NN, ϵ-neighborhood, and full-RBF graphs. To deal with imbalanced data, we propose a graph partitioning problem that seeks minimum-cut partitions under minimum size constraints on clusters. Our approach parameterizes a family of graphs by adaptively modulating node degrees on a fixed node set, yielding a set of parameter-dependent cuts that reflect varying levels of imbalance. The solution to our problem is then obtained by optimizing over these parameters. We present an asymptotic limit cut analysis to justify our approach. Experiments on synthetic and real data sets demonstrate the superiority of our method.
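The pipeline outlined in the abstract (a parameterized family of graphs, one cut per parameter, then selection under a minimum size constraint) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The degree rule in `modulated_knn_graph`, the density proxy, the parameter names `k0`, `lam` and `min_size`, and the choice to score every cut on a plain k-NN reference graph are all assumptions made for illustration only.

```python
import numpy as np

def pairwise_sq_dists(X):
    """Squared Euclidean distances between all rows of X."""
    sq = np.sum(X ** 2, axis=1)
    return np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)

def modulated_knn_graph(D, k0, lam):
    """Symmetric 0/1 graph with per-node degree modulated by a density rank.

    Hypothetical rule (an assumption, not taken from the abstract):
    lam = 1 gives a plain k0-NN graph; smaller lam gives denser nodes
    more neighbors and sparser nodes fewer.
    """
    n = D.shape[0]
    D2 = D.copy()
    np.fill_diagonal(D2, np.inf)           # exclude self-neighbors
    order = np.argsort(D2, axis=1)         # neighbors sorted by distance
    kth = D2[np.arange(n), order[:, k0 - 1]]   # small value = dense region
    rank = 1.0 - np.argsort(np.argsort(kth)) / (n - 1)  # 1 = densest node
    k = np.round(k0 * (lam + 2.0 * (1.0 - lam) * rank)).astype(int)
    k = np.clip(k, 1, n - 1)
    W = np.zeros((n, n))
    for i in range(n):
        W[i, order[i, :k[i]]] = 1.0
    return np.maximum(W, W.T)              # symmetrize

def spectral_bipartition(W):
    """Two-way split from the second eigenvector of the normalized Laplacian."""
    d = W.sum(axis=1)
    inv_sqrt = 1.0 / np.sqrt(d)
    L = np.eye(len(d)) - W * inv_sqrt[:, None] * inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(L)            # eigenvalues in ascending order
    v = vecs[:, 1] * inv_sqrt
    return (v > 0).astype(int)

def select_partition(X, k0=8, lams=(0.2, 0.4, 0.6, 0.8, 1.0), min_size=10):
    """Pick the smallest-cut partition among candidates meeting min_size.

    Cuts are scored on a plain k0-NN reference graph (an assumption).
    Returns (labels, cut, lam), or None if no candidate is feasible.
    """
    D = pairwise_sq_dists(X)
    W_ref = modulated_knn_graph(D, k0, 1.0)
    best = None
    for lam in lams:
        labels = spectral_bipartition(modulated_knn_graph(D, k0, lam))
        sizes = np.bincount(labels, minlength=2)
        if sizes.min() < min_size:
            continue                       # violates the size constraint
        cut = W_ref[np.ix_(labels == 0, labels == 1)].sum()
        if best is None or cut < best[1]:
            best = (labels, cut, lam)
    return best

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Two nearby clusters of unequal size (synthetic data for illustration)
    X = np.vstack([rng.normal(0.0, 0.5, (200, 2)),
                   rng.normal([2.5, 0.0], 0.5, (40, 2))])
    result = select_partition(X)
    if result is not None:
        labels, cut, lam = result
        print("sizes:", np.bincount(labels), "cut:", cut, "lam:", lam)
```

Scoring all candidates on one fixed reference graph keeps the cut values comparable across parameters; comparing each cut on its own graph would favor whichever graph happens to have the fewest edges.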
Keywords :
graph theory; pattern clustering; ϵ-neighborhood; adaptively modulating node degrees; asymptotic limit cut analysis; fixed node set; full-RBF graphs; graph construction; graph partitioning problem; imbalanced clusters; imbalanced data; k-NN; minimum cut partitions; minimum size constraints; proximal clusters; spectral clustering; Clustering algorithms; Error analysis; Information processing; Minimization; Moon; Partitioning algorithms; Robustness; Imbalanced Data; RatioCut/Normalized Cut; Spectral Clustering;
Conference_Title :
Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on
Conference_Location :
Florence
DOI :
10.1109/ICASSP.2014.6854162