DocumentCode :
580108
Title :
A design methodology for domain-optimized power-efficient supercomputing
Author :
Mohiyuddin, M. ; Murphy, Michael ; Oliker, Leonid ; Shalf, J. ; Wawrzynek, J. ; Williams, S.
Author_Institution :
EECS Dept., Univ. of California at Berkeley, Berkeley, CA, USA
fYear :
2009
fDate :
14-20 Nov. 2009
Firstpage :
1
Lastpage :
12
Abstract :
As power has become the pre-eminent design constraint for future HPC systems, computational efficiency is being emphasized over peak performance alone. Recently, static benchmark codes have been used to find a power-efficient architecture. Unfortunately, because compilers generate sub-optimal code, benchmark performance can be a poor indicator of the performance potential of architecture design points. Therefore, we present hardware/software co-tuning as a novel approach for system design, in which traditional architecture space exploration is tightly coupled with software auto-tuning to deliver substantial improvements in area and power efficiency. We demonstrate the proposed methodology by exploring the parameter space of a Tensilica-based multi-processor running three of the most heavily used kernels in scientific computing, each with widely varying micro-architectural requirements: sparse matrix-vector multiplication, stencil-based computations, and general matrix-matrix multiplication. Results demonstrate that co-tuning significantly improves hardware area and energy efficiency, a key driver for the next generation of HPC system design.
Keywords :
matrix multiplication; multiprocessing systems; parallel architectures; parallel machines; power aware computing; program compilers; sparse matrices; vectors; HPC system design; Tensilica-based multiprocessor; architecture design; architecture space exploration; benchmark performance; compiler; computational efficiency; domain-optimized power-efficient supercomputing; energy efficiency; general matrix-matrix multiplication; hardware-software cotuning; kernels; microarchitectural requirement; performance potential; power efficiency; power efficient architecture; scientific computing; software autotuning; sparse matrix vector multiplication; static benchmark code; stencil-based computation; suboptimal code generation;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
High Performance Computing Networking, Storage and Analysis, Proceedings of the Conference on
Conference_Location :
Portland, OR
Type :
conf
DOI :
10.1145/1654059.1654072
Filename :
6375557