DocumentCode
2144907
Title
Optimizing remote accesses for offloaded kernels: Application to high-level synthesis for FPGA
Author
Alias, Christophe ; Darte, Alain ; Plesco, Alexandru
Author_Institution
Compsys, LIP, UMR 5668 CNRS, INRIA, ENS-Lyon, UCB-Lyon, France
fYear
2013
fDate
18-22 March 2013
Firstpage
575
Lastpage
580
Abstract
Some data- and compute-intensive applications can be accelerated by offloading portions of codes to platforms such as GPGPUs or FPGAs. However, to get high performance for these kernels, it is mandatory to restructure the application, to generate adequate communication mechanisms for the transfer of remote data, and to make good usage of the memory bandwidth. In the context of the high-level synthesis (HLS), from a C program, of hardware accelerators on FPGA, we show how to automatically generate optimized remote accesses for an accelerator communicating to an external DDR memory. Loop tiling is used to enable block communications, suitable for DDR memories. Pipelined communication processes are generated to overlap communications and computations, thereby hiding some latencies, in a way similar to double buffering. Finally, not only intra-tile but also inter-tile data reuse is exploited to avoid remote accesses when data are already available in the local memory. Our first contribution is to show how to generate the sets of data to be read from (resp. written to) the external memory just before (resp. after) each tile so as to reduce communications and reuse data as much as possible in the accelerator. The main difficulty arises when some data may be (re)defined in the accelerator and should be kept locally. Our second contribution is an optimized code generation scheme, entirely at source-level, i.e., in C, that allows us to compile all the necessary glue (the communication processes) with the same HLS tool as for the computation kernel. Both contributions use advanced polyhedral techniques for program analysis and transformation. Experiments with Altera HLS tools demonstrate how to use our techniques to efficiently map C kernels to FPGA.
Keywords
Acceleration; Indexes;
fLanguage
English
Publisher
ieee
Conference_Titel
Design, Automation & Test in Europe Conference & Exhibition (DATE), 2013
Conference_Location
Grenoble, France
ISSN
1530-1591
Print_ISBN
978-1-4673-5071-6
Type
conf
DOI
10.7873/DATE.2013.127
Filename
6513573
Link To Document