  • DocumentCode
    178623
  • Title
    Spectral clustering with imbalanced data
  • Author
    Qian, Jing; Saligrama, Venkatesh
  • Author_Institution
    Boston Univ., Boston, MA, USA
  • fYear
    2014
  • fDate
    4-9 May 2014
  • Firstpage
    3057
  • Lastpage
    3061
  • Abstract
    Spectral clustering is sensitive to how graphs are constructed from data. In particular, if the data has proximal and imbalanced clusters, spectral clustering can perform poorly on well-known graphs such as k-NN, ϵ-neighborhood and full-RBF graphs. We propose a graph partitioning problem that seeks minimum-cut partitions under minimum-size constraints on clusters to deal with imbalanced data. Our approach parameterizes a family of graphs by adaptively modulating node degrees on a fixed node set, to yield a set of parameter-dependent cuts reflecting varying levels of imbalance. The solution to our problem is then obtained by optimizing over these parameters. We present an asymptotic limit cut analysis to justify our approach. Experiments on synthetic and real data sets demonstrate the superiority of our method.
  • Keywords
    graph theory; pattern clustering; ϵ-neighborhood; adaptively modulating node degrees; asymptotic limit cut analysis; fixed node set; full-RBF graphs; graph construction; graph partitioning problem; imbalanced clusters; imbalanced data; k-NN; minimum cut partitions; minimum size constraints; proximal clusters; spectral clustering; Clustering algorithms; Error analysis; Information processing; Minimization; Moon; Partitioning algorithms; Robustness; Imbalanced Data; RatioCut/Normalized Cut; Spectral Clustering;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on
  • Conference_Location
    Florence
  • Type
    conf
  • DOI
    10.1109/ICASSP.2014.6854162
  • Filename
    6854162