Title :
SSDE-Cluster: Fast Overlapping Clustering of Networks Using Sampled Spectral Distance Embedding and GMMs
Author :
Magdon-Ismail, Malik ; Purnell, Jonathan
Author_Institution :
Dept. of Comput. Sci., Rensselaer Polytech. Inst., Troy, NY, USA
Abstract :
Clustering social networks is vital to understanding online interactions and influence. This task becomes more difficult when communities overlap and when the social networks become extremely large. We present an efficient (approximately linear) algorithm for constructing overlapping clusters. The algorithm first embeds the graph and then performs a metric clustering using a Gaussian Mixture Model (GMM). We evaluate the algorithm on the DBLP paper-paper network, which consists of about 1 million nodes and over 30 million edges; we can cluster this network in under 20 minutes on a modest single-CPU machine.
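The embed-then-cluster idea in the abstract can be illustrated with a minimal Python sketch. This is not the authors' SSDE implementation: as a stated assumption, the spectral step is replaced by a plain landmark embedding (hop distances to a few sampled nodes), and the GMM is a small spherical-covariance EM loop. The toy graph, the fixed landmarks, the 0.2 membership threshold, and all function names are hypothetical choices made for illustration only.

```python
import math
from collections import deque


def bfs_distances(adj, source):
    """Hop distances from `source` to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def landmark_embedding(adj, landmarks):
    """Embed each node as its vector of hop distances to the landmarks.

    Assumption: a simple stand-in for the paper's sampled spectral embedding.
    """
    columns = [bfs_distances(adj, s) for s in landmarks]
    far = len(adj)  # placeholder distance for unreachable nodes
    return {u: [float(c.get(u, far)) for c in columns] for u in adj}


def fit_spherical_gmm(points, init_means, iters=30):
    """EM for a mixture of spherical Gaussians; returns soft assignments."""
    k, d = len(init_means), len(points[0])
    means = [list(m) for m in init_means]
    variances = [1.0] * k
    weights = [1.0 / k] * k
    resp = []
    for _ in range(iters):
        # E-step: responsibility of each component for each point.
        resp = []
        for x in points:
            logs = []
            for j in range(k):
                sq = sum((a - b) ** 2 for a, b in zip(x, means[j]))
                logs.append(math.log(weights[j])
                            - 0.5 * d * math.log(2 * math.pi * variances[j])
                            - sq / (2 * variances[j]))
            top = max(logs)
            probs = [math.exp(v - top) for v in logs]
            total = sum(probs)
            resp.append([p / total for p in probs])
        # M-step: re-estimate weights, means, and per-component variance.
        for j in range(k):
            nj = max(sum(r[j] for r in resp), 1e-12)
            weights[j] = nj / len(points)
            means[j] = [sum(r[j] * x[i] for r, x in zip(resp, points)) / nj
                        for i in range(d)]
            sq = sum(r[j] * sum((a - b) ** 2 for a, b in zip(x, means[j]))
                     for r, x in zip(resp, points))
            variances[j] = max(sq / (d * nj), 1e-3)
    return resp


# Toy graph (hypothetical): two 5-cliques that share node 4.
adj = {u: set() for u in range(9)}
for group in ([0, 1, 2, 3, 4], [4, 5, 6, 7, 8]):
    for u in group:
        for v in group:
            if u != v:
                adj[u].add(v)

landmarks = [0, 8]  # fixed here for repeatability; sampled in practice
embedding = landmark_embedding(adj, landmarks)
nodes = sorted(adj)
points = [embedding[u] for u in nodes]
resp = fit_spherical_gmm(points, [embedding[s] for s in landmarks])

# A node joins every cluster whose responsibility passes the threshold,
# so a node may belong to more than one cluster.
memberships = {u: {j for j, r in enumerate(row) if r >= 0.2}
               for u, row in zip(nodes, resp)}
print(memberships)
```

Because the GMM yields a soft assignment per node, overlap falls out of thresholding the responsibilities rather than from a separate step. On this toy graph the shared node 4 sits midway between the two components and should land in both clusters, while the other nodes stay in one.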
Keywords :
Gaussian processes; graph theory; pattern clustering; social networking (online); CPU machine; DBLP paper-paper network; Gaussian mixture model; graph; metric clustering; online interaction; overlapping clustering; sampled spectral distance embedding-cluster; social network clustering; Algorithm design and analysis; Approximation algorithms; Clustering algorithms; Communities; Computer science; Measurement; Social network services;
Conference_Title :
Privacy, Security, Risk and Trust (PASSAT) and 2011 IEEE Third International Conference on Social Computing (SocialCom), 2011 IEEE Third International Conference on
Conference_Location :
Boston, MA
Print_ISBN :
978-1-4577-1931-8
DOI :
10.1109/PASSAT/SocialCom.2011.237