DocumentCode :
1772654
Title :
Understanding the design space of DRAM-optimized hardware FFT accelerators
Author :
Akin, Bilal ; Franchetti, F. ; Hoe, James C.
Author_Institution :
ECE Dept., Carnegie Mellon Univ., Pittsburgh, PA, USA
fYear :
2014
fDate :
18-20 June 2014
Firstpage :
248
Lastpage :
255
Abstract :
As technology scaling reaches its limits, pointing to the well-known memory and power wall problems, achieving high-performance and energy-efficient systems is becoming a significant challenge. Especially for data-intensive computing, efficient utilization of the memory subsystem is the key to achieving high performance and energy efficiency. We address this challenge in DRAM-optimized hardware accelerators for 1D, 2D and 3D fast Fourier transforms (FFTs) on large datasets. When the dataset has to be stored in external DRAM, the main challenge for FFT algorithm design lies in reshaping DRAM-unfriendly memory access patterns to eliminate excessive DRAM row buffer misses. More importantly, these algorithms need to be carefully mapped to the targeted platform's architecture, particularly the memory subsystem, to fully exploit its performance and energy efficiency potential. We use automatic design generation techniques to consider a family of DRAM-optimized FFT algorithms and their hardware implementation design space. In our evaluations, we demonstrate DRAM-optimized accelerator designs over a large tradeoff space given various problem parameters (single/double precision 1D, 2D and 3D FFTs) and hardware platform parameters (off-chip DRAM, 3D-stacked DRAM, ASIC, FPGA, etc.). We show that the generated Pareto-optimal designs can yield improvements of up to 5.5× in DRAM energy consumption and an order of magnitude in DRAM memory bandwidth utilization, which lead to overall system performance and power efficiency improvements of up to 6× and 6.5×, respectively, over conventional row-column FFT algorithms.
Keywords :
DRAM chips; fast Fourier transforms; DRAM optimized hardware FFT accelerators; DRAM optimized hardware accelerators; DRAM unfriendly memory access patterns; FFT algorithm design; data intensive computing; hardware implementation design space; memory subsystem; memory wall problems; power wall problems; technology scaling; Algorithm design and analysis; Bandwidth; Computer architecture; Energy consumption; Hardware; Random access memory; System-on-chip
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Application-specific Systems, Architectures and Processors (ASAP), 2014 IEEE 25th International Conference on
Conference_Location :
Zurich
Type :
conf
DOI :
10.1109/ASAP.2014.6868669
Filename :
6868669