DocumentCode :
3183790
Title :
Multilevel Granularity Parallelism Synthesis on FPGAs
Author :
Papakonstantinou, Alexandros ; Liang, Yun ; Stratton, John A. ; Gururaj, Karthik ; Chen, Deming ; Hwu, Wen-Mei W. ; Cong, Jason
Author_Institution :
Electr. & Comput. Eng. Dept., Univ. of Illinois, Urbana, IL, USA
fYear :
2011
fDate :
1-3 May 2011
Firstpage :
178
Lastpage :
185
Abstract :
Recent progress in High-Level Synthesis (HLS) techniques has helped raise the abstraction level of FPGA programming. However, implementation and performance evaluation of the HLS-generated RTL involve lengthy logic synthesis and physical design flows. Moreover, the mapping of different levels of coarse-grained parallelism onto hardware spatial parallelism affects the final FPGA-based performance in terms of both cycles and frequency. Evaluating the rich design space through the full implementation flow, starting with high-level source code and ending with a routed netlist, is prohibitive in various scientific and computing domains, thus hindering the adoption of reconfigurable computing. This work presents a framework for multilevel granularity parallelism exploration with HLS-order efficiency. Our framework considers different granularities of parallelism for mapping CUDA kernels onto high-performance FPGA-based accelerators. We leverage resource and clock period models to estimate the impact of multi-granularity parallelism extraction on execution cycles and frequency. The proposed Multilevel Granularity Parallelism Synthesis (ML-GPS) framework employs an efficient design space search heuristic in tandem with the estimation models and design layout information to derive a configuration with near-optimal performance. Our experimental results demonstrate that ML-GPS can efficiently identify and generate CUDA kernel configurations that significantly outperform previous related tools, while offering competitive performance compared to software kernel execution on GPUs at a fraction of the energy cost.
Keywords :
field programmable gate arrays; integrated circuit layout; logic design; CUDA kernel mapping; FPGA programming; FPGA-based accelerator; abstraction level; coarse grained parallelism; design layout information; design space search heuristic; hardware spatial parallelism; high-level synthesis technique; lengthy logic synthesis; multigranularity parallelism extraction; multilevel granularity parallelism synthesis; performance evaluation; physical design flow; reconfigurable computing; Arrays; Clocks; Estimation; Field programmable gate arrays; Instruction sets; Kernel; Parallel processing; Design Space Exploration; FPGA; High-Level Synthesis; Parallel Computing;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Field-Programmable Custom Computing Machines (FCCM), 2011 IEEE 19th Annual International Symposium on
Conference_Location :
Salt Lake City, UT
Print_ISBN :
978-1-61284-277-6
Electronic_ISBN :
978-0-7695-4301-7
Type :
conf
DOI :
10.1109/FCCM.2011.29
Filename :
5771270