Title :
Improved Energy Efficiency for Multithreaded Kernels through Model-Based Autotuning
Author :
Qasem, Apan ; Cade, Michael Jason ; Tamir, Dan
Author_Institution :
Dept. of Comput. Sci., Texas State Univ., San Marcos, TX, USA
Abstract :
In the last few years, the emergence of multicore architectures has revolutionized the landscape of high-performance computing. The multicore shift has not only increased the per-node performance potential of computer systems but has also made great strides in curbing power and heat dissipation. As we look to the future, however, the gains in performance and energy consumption are not going to come from hardware alone. Software needs to play a key role in achieving a high fraction of peak performance and keeping energy consumption within the desired envelope. To attain this goal, performance-enhancing and energy-conserving software needs to carefully orchestrate many architecture-sensitive parameters. In particular, the presence of shared caches on multicore architectures makes it necessary to consider, in concert, issues related to both parallelism and data locality to achieve the desired power-performance ratio. This paper studies the complex interaction among several code transformations that affect data locality, problem decomposition, and selection of loops for parallelism. We characterize this interaction using static compiler analysis and generate a pruned search space suitable for efficient autotuning. We also extend a heuristic based on the number of threads, data reuse patterns, and the size and configuration of the shared cache to estimate a good synchronization interval for conserving energy in parallel code. We validate our choice of tuning parameters and evaluate our heuristic with experiments on a set of scientific and engineering kernels on four different multicore platforms. Results of the experimental study reveal several interesting properties of the transformation search space and demonstrate the effectiveness of the heuristic in predicting good synchronization intervals that reduce energy consumption without a significant degradation in performance.
Keywords :
cache storage; cooling; energy conservation; energy consumption; multi-threading; multiprocessing systems; parallel architectures; program compilers; program diagnostics; shared memory systems; synchronisation; tuning; architecture-sensitive parameters; code transformations; complex interaction; computer systems; curbing power; data locality; data reuse patterns; energy consumption; energy efficiency; energy-conserving software; heat dissipation; high-performance computing; model-based autotuning; multicore architectures; multicore shift; multithreaded kernels; parallel code; peak high fraction; power-performance ratio; problem decomposition; search space; shared cache configuration; static compiler analysis; transformation search space; tuning parameters; Computational modeling; Energy consumption; Multicore processing; Parallel processing; Power demand; Synchronization; Tuning;
Conference_Title :
Green Technologies Conference, 2012 IEEE
Conference_Location :
Tulsa, OK
Print_ISBN :
978-1-4673-0968-4
Electronic_ISBN :
2166-546X
DOI :
10.1109/GREEN.2012.6200963