Regional Information Center for Science and Technology (RICEST) - Architectural synthesis of computational pipelines with decoupled memory access

DocumentCode :

3585582

Title :

Architectural synthesis of computational pipelines with decoupled memory access

Author :

Cheng, Shaoyi ; Wawrzynek, John

Author_Institution :

Dept. of EECS, UC Berkeley, Berkeley, CA, USA

fYear :

2014

Firstpage :

Lastpage :

Abstract :

As high-level synthesis (HLS) moves towards mainstream adoption among FPGA designers, it has proven to be an effective method for rapid hardware generation. However, in the context of offloading compute-intensive software kernels to FPGA accelerators, current HLS tools do not always take full advantage of the hardware platforms. In this paper, we present an automatic flow to refactor and restructure processor-centric software implementations, making them better suited for FPGA platforms. The methodology generates pipelines that decouple memory operations and data access from computation. The resulting pipelines have much better throughput due to their efficient use of memory bandwidth and improved tolerance to data access latency. The methodology complements existing work in high-level synthesis, easing the creation of heterogeneous systems with high-performance accelerators and general-purpose processors. With this approach, for a set of non-regular algorithm kernels written in C, a performance improvement of 3.3x to 9.1x is observed over direct C-to-hardware mapping using a state-of-the-art HLS tool.
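To illustrate the idea of decoupling memory access from computation, the following is a minimal software sketch in C and not the flow or hardware described in the paper. It assumes a hypothetical non-regular kernel (an indirect gather-and-sum) and a hypothetical fixed-depth FIFO (`fifo_t`, `FIFO_DEPTH`) standing in for a hardware queue between an access stage and a compute stage. All function and type names are illustrative assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical queue depth; models how far the access stage may run ahead. */
#define FIFO_DEPTH 8

/* Hypothetical ring-buffer FIFO standing in for a hardware queue. */
typedef struct {
    int32_t data[FIFO_DEPTH];
    size_t head, tail, count;
} fifo_t;

static int fifo_full(const fifo_t *f)  { return f->count == FIFO_DEPTH; }
static int fifo_empty(const fifo_t *f) { return f->count == 0; }

static void fifo_push(fifo_t *f, int32_t v) {
    assert(!fifo_full(f));
    f->data[f->tail] = v;
    f->tail = (f->tail + 1) % FIFO_DEPTH;
    f->count++;
}

static int32_t fifo_pop(fifo_t *f) {
    assert(!fifo_empty(f));
    int32_t v = f->data[f->head];
    f->head = (f->head + 1) % FIFO_DEPTH;
    f->count--;
    return v;
}

/* Processor-centric form: index load, data load and accumulate
 * are fused in a single loop body. */
int64_t gather_sum_coupled(const int32_t *table, const int32_t *idx, size_t n) {
    int64_t sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += table[idx[i]];
    return sum;
}

/* Decoupled form: an access stage issues loads ahead of time into a FIFO,
 * and a separate compute stage drains the FIFO. In this software model the
 * two stages are interleaved in one loop; in hardware they would run
 * concurrently. */
int64_t gather_sum_decoupled(const int32_t *table, const int32_t *idx, size_t n) {
    fifo_t q = {0};
    size_t issued = 0, consumed = 0;
    int64_t sum = 0;

    while (consumed < n) {
        /* Access stage: keep issuing loads until the queue is full. */
        while (issued < n && !fifo_full(&q)) {
            fifo_push(&q, table[idx[issued]]);
            issued++;
        }
        /* Compute stage: consume one loaded value per step. */
        if (!fifo_empty(&q)) {
            sum += fifo_pop(&q);
            consumed++;
        }
    }
    return sum;
}
```

Both functions return the same result for the same inputs. The decoupled version only makes explicit that the compute stage never touches memory directly, so a slow load would stall the access stage without emptying the queue immediately.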

Keywords :

field programmable gate arrays; high level synthesis; logic design; pipeline processing; FPGA accelerators; FPGA designers; FPGA platforms; HLS tools; architectural synthesis; automatic flow; computational pipelines; compute intensive software kernels; data access latency; decoupled memory access; direct C-to-hardware mapping; general purpose processors; hardware platforms; heterogeneous systems; high level synthesis; high performance accelerators; memory bandwidth; memory operations; nonregular algorithm kernels; processor-centric software implementations; rapid hardware generation; Clustering algorithms; Field programmable gate arrays; Hardware; Kernel; Pipeline processing; Program processors; FPGA; Hardware Acceleration; High-level Synthesis; Memory Subsystem Optimization; Memory-level Parallelism; Pipeline Parallelism;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Field-Programmable Technology (FPT), 2014 International Conference on

Print_ISBN :

978-1-4799-6244-0

Type :

conf

DOI :

10.1109/FPT.2014.7082758

Filename :

7082758

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=3585582