DocumentCode
2205477
Title
HeteroSpark: A heterogeneous CPU/GPU Spark platform for machine learning algorithms
Author
Peilong Li ; Yan Luo ; Ning Zhang ; Yu Cao
Author_Institution
Dept. of Electrical and Computer Engineering, University of Massachusetts Lowell, USA
fYear
2015
fDate
6-7 Aug. 2015
Firstpage
347
Lastpage
348
Abstract
Analytics algorithms on big data sets require tremendous computational capabilities. Spark is a recent development that addresses big data challenges with data and computation distribution and in-memory caching. However, as a CPU-only framework, Spark cannot leverage GPUs and a growing set of GPU libraries to achieve better performance and energy efficiency. We present HeteroSpark, a GPU-accelerated heterogeneous architecture integrated with Spark, which combines the massive compute power of GPUs with the scalability of CPUs and system memory resources for applications that are both data- and compute-intensive. We make the following contributions in this work: (1) we integrate the GPU accelerator into the current Spark framework to further leverage data parallelism and achieve algorithm acceleration; (2) we provide a plug-n-play design by augmenting the Spark platform so that current Spark applications can choose to enable or disable GPU acceleration; (3) application acceleration is transparent to developers, so existing Spark applications can be easily ported to this heterogeneous platform without code modifications. The evaluation of HeteroSpark demonstrates up to 18× speedup on a number of machine learning applications.
Keywords
Acceleration; Big data; Computer architecture; Graphics processing units; Libraries; Machine learning algorithms; Sparks;
fLanguage
English
Publisher
IEEE
Conference_Titel
Networking, Architecture and Storage (NAS), 2015 IEEE International Conference on
Conference_Location
Boston, MA, USA
Type
conf
DOI
10.1109/NAS.2015.7255222
Filename
7255222