Title :
The Synchronization Treatment in Implementing Data-Parallel Programming Languages on CPUs
Author :
Feng Yue ; Jianmin Pang ; Rongcai Zhao ; Chao Dai
Author_Institution :
State Key Lab. of Math. Eng. & Adv. Comput., Zhengzhou, China
Abstract :
When data-parallel programming languages such as CUDA and OpenCL are implemented on CPUs, synchronization must be simulated correctly. The basic method is thread-based: all threads must execute one instruction in turn before any of them executes the next one. In this paper, we propose function splitting, which treats synchronization in a coroutine style rather than a purely thread-based one. It splits the data-parallel function, represented in a low-level intermediate representation, into several parts to simulate synchronization. We evaluate our method by translating PTX kernels to multi-core CPUs; the results show that it can improve performance by 15% compared to the thread-based method. Our main contribution is a general synchronization treatment that operates on low-level intermediate code given as a control flow graph in SSA form.
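The idea of splitting a data-parallel function around a synchronization point can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the paper works on low-level intermediate code (a control flow graph in SSA form), whereas the sketch splits a toy kernel by hand. The names part0, part1, run_block, and ctx are hypothetical. The kernel is assumed to contain a single barrier, and the one value that is live across it is saved in a per-thread context.

```python
# Hypothetical kernel, as it might look before splitting:
#   doubled = data[tid] * 2
#   shared[tid] = doubled
#   barrier()
#   out[tid] = doubled + shared[(tid + 1) % n]

def part0(tid, data, shared, ctx):
    """Code before the barrier, run once per logical thread."""
    doubled = data[tid] * 2
    shared[tid] = doubled
    # Assumption: values live across the barrier are saved per thread.
    ctx[tid] = {"doubled": doubled}


def part1(tid, shared, out, ctx, n):
    """Code after the barrier, resumed with the saved per-thread value."""
    out[tid] = ctx[tid]["doubled"] + shared[(tid + 1) % n]


def run_block(data):
    """Run all logical threads of one block on a single CPU thread."""
    n = len(data)
    shared = [0] * n
    out = [0] * n
    ctx = [None] * n
    # Every logical thread finishes part0 before any starts part1,
    # so the end of this loop plays the role of the barrier.
    for tid in range(n):
        part0(tid, data, shared, ctx)
    for tid in range(n):
        part1(tid, shared, out, ctx, n)
    return out


if __name__ == "__main__":
    print(run_block([1, 2, 3, 4]))
```

Each logical thread switches context only at the barrier instead of after every instruction, which is the intuition behind the coroutine-style treatment described in the abstract.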
Keywords :
data flow graphs; multi-threading; multiprocessing systems; parallel languages; program interpreters; synchronisation; PTX kernels; SSA; control flow graph; coroutine style; data-parallel function; data-parallel programming languages; function splitting; low-level intermediate code; low-level intermediate representation; multicore CPU; static single assignment; synchronization treatment; thread-based method; Algorithm design and analysis; Graphics processing units; Instruction sets; Registers; Switches; Synchronization; Transforms; Data Parallelism; Function Splitting; Low-Level Intermediate Code; SSA; Synchronization; Thread;
Conference_Title :
High Performance Computing and Communications & 2013 IEEE International Conference on Embedded and Ubiquitous Computing (HPCC_EUC), 2013 IEEE 10th International Conference on
Conference_Location :
Zhangjiajie
DOI :
10.1109/HPCC.and.EUC.2013.275