Title :
A BSP-Based Parallel Iterative Processing System with Multiple Partition Strategies for Big Graphs
Author :
Zhigang Wang ; Yubin Bao ; Yu Gu ; Fangling Leng ; Ge Yu ; Chao Deng ; Leitao Guo
Author_Institution :
Coll. of Inf. Sci. & Eng., Northeastern Univ., Shenyang, China
Date :
June 27 2013-July 2 2013
Abstract :
Many real-world applications produce large amounts of data that can be modeled as a graph, and such graphs often have millions of vertices and billions of edges. This paper presents BC-BSP+, a BSP-based system for processing large graphs iteratively and in parallel. It offers configurable policies (i.e., disk management parameters) and extensible functions (i.e., programming interfaces), and it supports large-scale graph computation, fault tolerance, and load balancing. In particular, three graph partition strategies are proposed in BC-BSP+ to support large graph processing: Randomized Hash Partition (RHP), Balanced Hash Partition (BHP), and Vertex-Cut based on the Range Partition method (VCRP). Extensive experiments are conducted to evaluate BC-BSP+. The results show that VCRP outperforms BHP, although BHP is more general. Compared with Hadoop, a MapReduce-based system, BC-BSP+ achieves a speedup of roughly 8. Compared with the BSP-based systems Hama and Giraph, it achieves a speedup of 2 to 6, benefiting from VCRP.
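The abstract names hash-based and range-based partitioning but does not give their details. As a rough illustration of the general difference between the two families, the following minimal sketch assigns vertex ids to partitions either by hashing or by contiguous id range. It is not the BC-BSP+ implementation of RHP, BHP, or VCRP; the function names `hash_partition` and `range_partition`, the integer vertex ids, and the equal-width ranges are all assumptions made for illustration.

```python
def hash_partition(vertex_id: int, num_partitions: int) -> int:
    """Generic hash partitioning (hypothetical helper, not BC-BSP+ code).

    Neighbouring vertex ids are scattered across partitions.
    """
    return hash(vertex_id) % num_partitions


def range_partition(vertex_id: int, num_vertices: int, num_partitions: int) -> int:
    """Generic range partitioning (hypothetical helper, not BC-BSP+ code).

    Vertex ids 0..num_vertices-1 are split into contiguous, equal-width ranges,
    so neighbouring ids tend to land in the same partition.
    """
    range_width = -(-num_vertices // num_partitions)  # ceiling division
    return vertex_id // range_width


if __name__ == "__main__":
    # Assumed toy input: 10 vertices with ids 0..9, 4 partitions.
    for v in range(10):
        print(v, hash_partition(v, 4), range_partition(v, 10, 4))
```

Under these assumptions, hashing spreads consecutive ids over different partitions, while range partitioning keeps them together, which may preserve locality when the ids reflect graph structure.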
Keywords :
computer graphics; file organisation; parallel processing; BC-BSP+ system; BHP; BSP-based parallel iterative processing system; Giraph system; Hadoop; Hama system; MapReduce; RHP; VCRP; balanced hash partition; big graph; bulk synchronous parallel model; fault tolerance; graph edge; graph partition strategy; graph vertex; load balancing; randomized hash partition; vertex-cut based on the range partition method; Computational modeling; Data handling; Data storage systems; Information management; Partitioning algorithms; Synchronization; Web pages; BSP; MapReduce; graph partition; graph process; load balance;
Conference_Title :
Big Data (BigData Congress), 2013 IEEE International Congress on
Conference_Location :
Santa Clara, CA
Print_ISBN :
978-0-7695-5006-0
DOI :
10.1109/BigData.Congress.2013.31