DocumentCode :
3537807
Title :
An Instruction-Level Energy Estimation and Optimization Methodology for GPU
Author :
Wang, Yue ; Ranganathan, Nagarajan
Author_Institution :
Dept. of Comput. Sci. & Eng., Univ. of South Florida, Tampa, FL, USA
fYear :
2011
fDate :
Aug. 31 2011-Sept. 2 2011
Firstpage :
621
Lastpage :
628
Abstract :
GPU architectures are now widely used in computer graphics research and in other areas of scientific computing. The parallel computing capability of the GPU provides performance benefits for many programs. However, as the degree of parallelism grows, the number of active GPU cores required for execution also increases, and the resulting rise in energy consumption has begun to draw attention. Previous research [1] shows that, for a given multicore program, energy consumption first falls and then rises as the number of active cores increases. This means that energy consumption can be minimized if the number of active cores is properly configured. In this paper, we develop an instruction-level prediction mechanism to estimate the energy consumption of a given program under different numbers of cores. The prediction is based on a profile of the Parallel Thread Execution (PTX) [2] code generated during compilation of the original program. With this mechanism, the energy-optimal number of cores can be found during compilation and used in execution, replacing the number specified by the programmer. Tests were carried out on several NVIDIA CUDA [10] benchmarks. The results show that energy consumption is minimized without much loss of performance. With the predicted energy-optimal number of active cores, the average energy savings for the selected benchmarks range from 7.31% to 11.76%, with a worst-case performance loss of 4.92%.
Keywords :
computer graphic equipment; coprocessors; parallel architectures; GPU architecture; NVIDIA CUDA benchmarks; instruction-level energy estimation; instruction-level prediction mechanism; minimum energy consumption; optimization methodology; parallel computing; parallel thread execution codes; Benchmark testing; Energy consumption; Energy measurement; Graphics processing unit; Instruction sets; Multicore processing; Registers; GPU; energy; energy estimation; energy optimization; instruction-level; multicore;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Computer and Information Technology (CIT), 2011 IEEE 11th International Conference on
Conference_Location :
Pafos
Print_ISBN :
978-1-4577-0383-6
Electronic_ISBN :
978-0-7695-4388-8
Type :
conf
DOI :
10.1109/CIT.2011.69
Filename :
6036835