Regional Information Center for Science and Technology - BigKernel -- High Performance CPU-GPU Communication Pipelining for Big Data-Style Applications

DocumentCode :

1783320

Title :

BigKernel -- High Performance CPU-GPU Communication Pipelining for Big Data-Style Applications

Author :

Mokhtari, Reza ; Stumm, Michael

Author_Institution :

Dept. of Electr. & Comput. Eng., Univ. of Toronto, Toronto, ON, Canada

fYear :

2014

fDate :

19-23 May 2014

Firstpage :

819

Lastpage :

828

Abstract :

GPUs offer an order of magnitude higher compute power and memory bandwidth than CPUs. GPUs therefore might appear to be well suited to accelerate computations that operate on voluminous data sets in independent ways, e.g., for transformations, filtering, aggregation, partitioning or other "Big Data" style processing. Yet experience indicates that it is difficult, and often error-prone, to write GPGPU programs that efficiently process data that does not fit in GPU memory, partly because of the intricacies of GPU hardware architecture and programming models, and partly because of the limited bandwidth available between GPUs and CPUs. In this paper, we propose BigKernel, a scheme that provides pseudo-virtual memory to GPU applications and is implemented using a 4-stage pipeline with automated prefetching to (i) optimize CPU-GPU communication and (ii) optimize GPU memory accesses. BigKernel simplifies the programming model by allowing programmers to write kernels using arbitrarily large data structures that can be partitioned into segments, where each segment is operated on independently; these kernels are transformed into BigKernel using straightforward compiler transformations. Our evaluation on six data-intensive benchmarks shows that BigKernel achieves an average speedup of 1.7 over state-of-the-art double-buffering techniques and an average speedup of 3.0 over corresponding multi-threaded CPU implementations.

Keywords :

Big Data; data structures; graphics processing units; pipeline processing; program compilers; storage management; Big Data-style processing; BigKernel scheme; GPGPU programs; GPU hardware architecture; GPU memory accesses; GPU programming models; automated prefetching; data structures; double-buffering techniques; high performance CPU-GPU communication pipelining; magnitude higher compute power; memory bandwidth; multithreaded CPU; pseudovirtual memory; straight-forward compiler transformations; voluminous data sets; Arrays; Graphics processing units; Kernel; Memory management; Pipelines; Prefetching; CPU; GPU; communication; optimization; stream processing;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Parallel and Distributed Processing Symposium, 2014 IEEE 28th International

Conference_Location :

Phoenix, AZ

ISSN :

1530-2075

Print_ISBN :

978-1-4799-3799-8

Type :

conf

DOI :

10.1109/IPDPS.2014.89

Filename :

6877313

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=1783320