Title :
Locality-centric thread scheduling for bulk-synchronous programming models on CPU architectures
Author :
Kim, Hee-Seok ; El Hajj, Izzat ; Stratton, John ; Lumetta, Steven ; Hwu, Wen-mei
Abstract :
With heterogeneous computing on the rise, executing programs efficiently on different devices from a single source code has become increasingly important. OpenCL, having a bulk-synchronous programming model, has been proposed as a framework for writing such performance-portable programs. Execution order of work-items in a program is unconstrained except at barrier synchronization events, giving some freedom to an implementation when scheduling work-items between synchronization points. Many OpenCL (and CUDA) compilers have been designed for targeting multicore CPU architectures. However, scheduling work-items in prior work has been done with primary focus on correctness and vectorization. To the best of our knowledge, no existing implementations consider the impact of work-item scheduling on data locality. We propose an OpenCL compiler that performs data-locality-centric work-item scheduling. By analyzing the memory addresses accessed in loops within a kernel, our technique can make better decisions on how to schedule work-items to construct better memory access patterns, thereby improving performance. Our approach achieves geomean speedups of 3.32× over AMD's and 1.71× over Intel's implementations on Parboil and Rodinia benchmarks.
Keywords :
microprocessor chips; multi-threading; parallel architectures; scheduling; AMD; CPU architectures; CUDA compilers; OpenCL compiler; Parboil benchmarks; Rodinia benchmarks; bulk synchronous programming models; data locality; geomean speedups; locality centric thread scheduling; multicore CPU architectures; performance portable programs; single source code; work-item scheduling; Benchmark testing; Computer architecture; Indexes; Kernel; Schedules; Scheduling; Synchronization;
Conference_Title :
Code Generation and Optimization (CGO), 2015 IEEE/ACM International Symposium on
Conference_Location :
San Francisco, CA
DOI :
10.1109/CGO.2015.7054205