• DocumentCode
    3739237
  • Title
    Large Data Clustering Using Quadratic Programming: A Comprehensive Quantitative Analysis
  • Author
    Alireza Chakeri;Lawrence O. Hall
  • Author_Institution
    Comput. Sci. &
  • fYear
    2015
  • Firstpage
    806
  • Lastpage
    813
  • Abstract
    We address the space complexity challenge in large graph clustering using quadratic programming, and present a comprehensive technical analysis and an alternative solution based on game-theoretic concepts. We develop an approximate solution to the problem of clustering graphs with a large number of vertices in order to overcome the space complexity issues. In particular, the quadratic programming formulation requires the edge weights between every pair of vertices, which proves practically intractable for large data sets. Our scalable method divides a graph into disjoint subgraphs of tractable size, whose clusters are enumerated using a novel search of the solution space. Then, the clusters obtained in each subgraph are grouped using a low-resolution ensemble clustering method. The exact maxima of the quadratic programming problem on the entire graph are approximated by the maxima on the subsets of the graph. Finally, vertices are assigned to the final clusters using a linear game-theoretic relation. We also pose the question "How can a cluster of a subset of a dataset be a cluster of the entire dataset?". We show that, in the quadratic programming framework, this problem is coNP-hard. Hence, we modify the definition of a cluster from a stable concept to a non-stable but optimal one (Nash equilibrium), which makes it computationally practical to find clusters in graphs with large numbers of vertices. On the Berkeley Segmentation Dataset, the proposed method achieves results comparable to the state of the art, providing a parallel framework for image segmentation.
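The abstract does not spell out the quadratic program or the solver. As a minimal sketch only, assuming the standard formulation often used in game-theoretic clustering (maximize x^T A x over the probability simplex for a symmetric, non-negative affinity matrix A with zero diagonal), one local maximum can be found with discrete-time replicator dynamics. The function names, the toy matrix, and the thresholds below are hypothetical illustrations, not the authors' implementation; the sketch covers a single tractable subgraph and omits the subgraph partitioning, ensemble grouping, and final vertex assignment described above.

```python
def replicator_dynamics(A, iters=1000, tol=1e-9):
    """Locally maximize x^T A x over the probability simplex.

    A: symmetric, non-negative affinity matrix (list of lists), zero diagonal.
    Returns a vector x whose large entries mark the vertices of one cluster.
    Assumption: this solver is illustrative, not taken from the paper.
    """
    n = len(A)
    x = [1.0 / n] * n  # start at the barycenter of the simplex
    for _ in range(iters):
        Ax = [sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
        xAx = sum(x[i] * Ax[i] for i in range(n))
        if xAx == 0.0:  # no affinity at all; nothing to optimize
            break
        new_x = [x[i] * Ax[i] / xAx for i in range(n)]  # stays on the simplex
        delta = sum(abs(new_x[i] - x[i]) for i in range(n))
        x = new_x
        if delta < tol:
            break
    return x


def support(x, eps=1e-4):
    """Indices with non-negligible weight, read as the extracted cluster."""
    return [i for i, xi in enumerate(x) if xi > eps]


# Hypothetical toy subgraph: vertices 0-2 are tightly connected,
# vertices 3-4 are weakly connected to each other and to the rest.
A = [
    [0.00, 1.00, 1.00, 0.05, 0.05],
    [1.00, 0.00, 1.00, 0.05, 0.05],
    [1.00, 1.00, 0.00, 0.05, 0.05],
    [0.05, 0.05, 0.05, 0.00, 0.20],
    [0.05, 0.05, 0.05, 0.20, 0.00],
]
x = replicator_dynamics(A)
cluster = support(x)
print(cluster)
```

The sketch stores the full affinity matrix, which is exactly the quadratic space cost the abstract identifies as intractable for large graphs; running it independently on each small subgraph is what keeps that cost bounded.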
  • Keywords
    "Games","Nash equilibrium","Sociology","Statistics","Quadratic programming","Symmetric matrices"
  • Publisher
    IEEE
  • Conference_Titel
    Data Mining Workshop (ICDMW), 2015 IEEE International Conference on
  • Electronic_ISBN
    2375-9259
  • Type
    conf
  • DOI
    10.1109/ICDMW.2015.151
  • Filename
    7395751