DocumentCode :
122377
Title :
Accelerating Spark with RDMA for Big Data Processing: Early Experiences
Author :
Lu, Xiaoyi ; Rahman, Md Wasi Ur ; Islam, Nahina ; Shankar, Dipti ; Panda, Dhabaleswar K.
Author_Institution :
Dept. of Comput. Sci. & Eng., Ohio State Univ., Columbus, OH, USA
fYear :
2014
fDate :
26-28 Aug. 2014
Firstpage :
9
Lastpage :
16
Abstract :
Apache Hadoop MapReduce has been highly successful in processing large-scale, data-intensive batch applications on commodity clusters. However, for low-latency interactive applications and iterative computations, Apache Spark, an emerging in-memory processing framework, has been stealing the limelight. Recent studies have shown that current-generation Big Data frameworks (like Hadoop) cannot efficiently leverage advanced features (e.g., RDMA) on modern clusters with high-performance networks. One of the major bottlenecks is that this middleware is traditionally written with sockets and does not deliver the best performance on modern HPC systems with RDMA-enabled high-performance interconnects. In this paper, we first assess the opportunities of bringing the benefits of RDMA into the Spark framework. We further propose a high-performance RDMA-based design for accelerating data shuffle in the Spark framework on high-performance networks. Performance evaluations show that our proposed design can achieve a 79-83% performance improvement for GroupBy, compared with default Spark running with IP over InfiniBand (IPoIB) FDR on a 128-256 core cluster. We adopt a plug-in-based approach that allows our design to be easily integrated with newer Spark releases. To the best of our knowledge, this is the first design for accelerating Spark with RDMA for Big Data processing.
Keywords :
Big Data; storage management; Apache Hadoop MapReduce; Apache Spark; RDMA; big data processing; in-memory processing framework; middleware; Engines; Java; Protocols; Servers; Sparks; Throughput; InfiniBand
fLanguage :
English
Publisher :
IEEE
Conference_Title :
High-Performance Interconnects (HOTI), 2014 IEEE 22nd Annual Symposium on
Conference_Location :
Mountain View, CA
Type :
conf
DOI :
10.1109/HOTI.2014.15
Filename :
6925713