Title :
Large scale complex network analysis using the hybrid combination of a MapReduce cluster and a highly multithreaded system
Author :
Kang, Seunghwa ; Bader, David A.
Author_Institution :
Georgia Inst. of Technol., Atlanta, GA, USA
Abstract :
Complex networks capture interactions among entities in various application areas in a graph representation. Analyzing large-scale complex networks often answers important questions (e.g., estimating the spread of epidemic diseases) but also imposes computing challenges, mainly due to large volumes of data and the irregular structure of the graphs. In this paper, we aim to solve such a challenge: finding relationships in a subgraph extracted from the data. We solve this problem using three different platforms: a MapReduce cluster, a highly multithreaded system, and a hybrid system of the two. The MapReduce cluster and the highly multithreaded system reveal limitations in efficiently solving this problem, whereas the hybrid system exploits the strengths of the two in a synergistic way and solves the problem at hand. In particular, once the subgraph is extracted and loaded into memory, the hybrid system analyzes the subgraph five orders of magnitude faster than the MapReduce cluster.
Keywords :
complex networks; graph theory; multi-threading; pattern clustering; MapReduce cluster; graph representation; hybrid system; large scale complex network analysis; multithreaded system; subgraph extraction; Bandwidth; Cloud computing; Computer networks; Data mining; Filtering; Large-scale systems; Power system modeling; Proteins; Sun; parallel algorithms; power-law graphs
Conference_Title :
Parallel & Distributed Processing, Workshops and PhD Forum (IPDPSW), 2010 IEEE International Symposium on
Conference_Location :
Atlanta, GA
Print_ISBN :
978-1-4244-6533-0
DOI :
10.1109/IPDPSW.2010.5470691