DocumentCode
3470691
Title
Porting Optimized GPU Kernels to a Multi-core CPU: Computational Quantum Chemistry Application Example
Author
Dong Ye ; Titov, Andrii ; Kindratenko, V. ; Ufimtsev, I. ; Martinez, Thierry
Author_Institution
Nat. Center for Supercomput. Applic., Univ. of Illinois at Urbana-Champaign, Urbana, IL, USA
fYear
2011
fDate
19-21 July 2011
Firstpage
72
Lastpage
75
Abstract
We investigate techniques for optimizing multi-core CPU code back-ported from a highly optimized GPU kernel. We show that common sub-expression elimination and loop unrolling improve code performance on the GPU but not on the CPU. In contrast, register reuse and loop merging are effective on the CPU, and in combination they improve the performance of the ported code by 16%.
Keywords
computer graphic equipment; coprocessors; merging; multiprocessing systems; optimisation; highly optimized GPU kernel; loop merging; loop unrolling optimization techniques; multicore CPU code; porting; register reuse; sub-expression elimination; Central Processing Unit; Chemistry; Graphics processing unit; Instruction sets; Kernel; Optimization; Registers; GPU; OpenMP; common sub-expression elimination; loop merging; loop unrolling; register reuse;
fLanguage
English
Publisher
IEEE
Conference_Titel
Application Accelerators in High-Performance Computing (SAAHPC), 2011 Symposium on
Conference_Location
Knoxville, TN
Print_ISBN
978-1-4577-0635-6
Electronic_ISBN
978-0-7695-4448-9
Type
conf
DOI
10.1109/SAAHPC.2011.8
Filename
6031568