Title :
Parallel mining of closed quasi-cliques
Author :
Zhang, Yuzhou ; Wang, Jianyong ; Zeng, Zhiping ; Zhou, Lizhu
Author_Institution :
Tsinghua Univ., Beijing
Abstract :
Graph structures can model the relationships among a set of objects. Mining quasi-clique patterns from large dense graph data is meaningful from both a statistical and an application standpoint. Applications of frequent quasi-cliques include stock price correlation discovery, gene function prediction, and protein molecular analysis. Although the graph mining community has devised many techniques to accelerate the discovery process, mining time remains unacceptable, especially on large dense graph data with a low support threshold. Parallel algorithms are therefore desirable for mining quasi-clique patterns. Message passing is one of the most widely used parallel frameworks. In this paper, we parallelize the state-of-the-art closed quasi-clique mining algorithm, Cocain, using message passing. The parallelized version of Cocain achieves a speedup of more than 30-fold on 32 processors in a cluster of SMPs. The techniques proposed in this work can be applied to parallelize other pattern-growth-based frequent pattern mining algorithms.
Keywords :
data mining; message passing; Cocain; closed quasi-cliques; frequent pattern mining algorithms; graph structure; large dense graph data; mining quasi-clique patterns; parallel mining; Acceleration; Chemicals; Clustering algorithms; Data mining; Databases; Message passing; Parallel algorithms; Proteins; Statistics; Workstations
Conference_Title :
Parallel and Distributed Processing, 2008. IPDPS 2008. IEEE International Symposium on
Conference_Location :
Miami, FL
Print_ISBN :
978-1-4244-1693-6
Electronic_ISBN :
1530-2075
DOI :
10.1109/IPDPS.2008.4536250