Title :
CCF: Fast and scalable connected component computation in MapReduce
Author :
Kardes, Hakan; Agrawal, Sanjay; Wang, Xin; Sun, Ang
Author_Institution :
Data Res., Inome Inc., Bellevue, WA, USA
Abstract :
Finding connected components in a graph is a well-known problem in a wide variety of application areas, such as social network analysis, data mining, and image processing. In this paper, we present an efficient and scalable approach in MapReduce to find all the connected components in a given graph. We compare our approach with the state of the art on a real-world graph. We also demonstrate the viability of our approach on a massive graph with ~6B nodes and ~92B edges on an 80-node Hadoop cluster. To the best of our knowledge, this is the largest graph publicly used in such an experiment.
Keywords :
data mining; network theory (graphs); parallel algorithms; MapReduce; connected component computation; massive graph; node Hadoop cluster; real-world graph; Algorithm design and analysis; Cleaning; Couplings; Databases; Feature extraction; Social network services; Connected Components; Hadoop; Large Scale Graphs; Transitive Closure
Conference_Title :
Computing, Networking and Communications (ICNC), 2014 International Conference on
Conference_Location :
Honolulu, HI
DOI :
10.1109/ICCNC.2014.6785473