• DocumentCode
    117289
  • Title
    Energy optimizations for FPGA-based 2-D FFT architecture
  • Author
    Chen, Ren ; Prasanna, Viktor K.
  • Author_Institution
    Ming Hsieh Dept. of Electr. Eng., Univ. of Southern California, Los Angeles, CA, USA
  • fYear
    2014
  • fDate
    9-11 Sept. 2014
  • Firstpage
    1
  • Lastpage
    6
  • Abstract
    The row-column algorithm is commonly used for 2-D FFT implementation on FPGA. However, in this algorithm, strided access to external memory such as DRAM incurs significant delay for DRAM row activation, resulting in high DRAM energy and a significant amount of FPGA device energy consumed in the idle state. In this paper, to optimize the energy consumption of the 2-D FFT architecture, we employ an FPGA-based 1-D FFT kernel that supports streaming data processing to fully utilize the available bandwidth offered by the external memory, and we balance the I/O bandwidth between the DRAM and the FPGA to minimize the FPGA idle time. Furthermore, to avoid time-consuming DRAM row activation, we decompose the transposition required by the column-wise FFTs into smaller problems, thus enabling on-chip local transposition, which can be performed by the customized data permutation unit used in the 1-D FFT kernel. Compared with the baseline 2-D FFT architecture, the optimized architecture achieves 3.9×, 4.2× and 4.5× improvement in energy efficiency for 1024 × 1024, 4096 × 4096 and 8192 × 8192 point 2-D FFTs, respectively. We also estimate the peak energy efficiency of the FPGA-based 2-D FFT architecture. Our estimation shows that our optimized 2-D FFT kernel can achieve 8.06 to 8.31 GFLOPS/W for various 2-D FFTs, i.e., up to 62% of the peak energy efficiency of the 2-D FFT architecture on FPGA.
  • Keywords
    DRAM chips; fast Fourier transforms; field programmable gate arrays; parallel architectures; 2-D FFT architecture; DRAM; FPGA; column-wise FFT; energy optimization; peak energy efficiency; row-column algorithm; Bandwidth; Computer architecture; Energy consumption; Field programmable gate arrays; Kernel; Random access memory; Throughput;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    High Performance Extreme Computing Conference (HPEC), 2014 IEEE
  • Conference_Location
    Waltham, MA
  • Print_ISBN
    978-1-4799-6232-7
  • Type
    conf
  • DOI
    10.1109/HPEC.2014.7040967
  • Filename
    7040967
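
The row-column algorithm named in the abstract can be illustrated with a small software sketch. This is not the paper's FPGA architecture: it only shows the general structure (row-wise 1-D FFTs, a transposition so that columns become contiguous, column-wise 1-D FFTs) and a tile-by-tile transposition loosely analogous to decomposing the transposition into smaller local problems. The function names and the `block` tile size are hypothetical and chosen for illustration only.

```python
import cmath


def fft1d(x):
    """Recursive radix-2 Cooley-Tukey FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft1d(x[0::2])
    odd = fft1d(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def blocked_transpose(m, block):
    """Transpose a square matrix one block x block tile at a time."""
    n = len(m)
    out = [[0j] * n for _ in range(n)]
    for bi in range(0, n, block):
        for bj in range(0, n, block):
            # Each tile is small enough to be handled locally.
            for i in range(bi, min(bi + block, n)):
                for j in range(bj, min(bj + block, n)):
                    out[j][i] = m[i][j]
    return out


def fft2d_row_column(m, block=2):
    """2-D FFT of a square power-of-two matrix via the row-column method."""
    rows = [fft1d(r) for r in m]           # row-wise 1-D FFTs
    cols = blocked_transpose(rows, block)  # columns become contiguous rows
    cols = [fft1d(c) for c in cols]        # column-wise 1-D FFTs
    return blocked_transpose(cols, block)  # restore the original layout


def dft2d_direct(m):
    """Direct O(n^4) 2-D DFT, used only as a reference for checking."""
    n = len(m)
    return [[sum(m[r][c] * cmath.exp(-2j * cmath.pi * (u * r + v * c) / n)
                 for r in range(n) for c in range(n))
             for v in range(n)] for u in range(n)]
```

On a small input, `fft2d_row_column` should agree with `dft2d_direct` up to floating-point rounding. In software the tile size changes only the traversal order, not the result; the energy effects described in the abstract arise from DRAM access patterns that this sketch does not model.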