• DocumentCode
    3075503
  • Title

    Streaming FFT Asynchronously on Graphics Processor Units

  • Author

    Zhao Lili ; Shengbing, Zhang ; Meng, Zhang ; Yi, Zhang

  • Author_Institution
    Eng. Res. Center of Embedded Syst. Integration, Northwestern Polytech. Univ. (NWPU), Xi'an, China
  • Volume
    1
  • fYear
    2010
  • fDate
    16-18 July 2010
  • Firstpage
    308
  • Lastpage
    312
  • Abstract
    The Fast Fourier Transform (FFT), which is memory-access-intensive and follows a divide-and-conquer strategy, is one of the most important and heavily used kernels in scientific computing. The newest generation of Graphics Processor Units (GPUs) implements a stream architecture in addition to acting as a powerful massively parallel coprocessor. Furthermore, the introduction of APIs for general-purpose computation on GPUs makes GPUs an attractive choice for high-performance numerical and scientific computing. In this work we deal with the implementation of the FFT on a novel NVIDIA GPU, using the CUDA programming model. By optimizing the organization of signal data, exploiting the memory hierarchy, and associating streams with different operations, we efficiently overlap kernel execution and data transfer. Our results indicate a significant performance improvement over GPU-based and CPU-based FFT algorithms. On average, the speedup is 18 percent higher than that of the original GPU-based implementation.
  • Keywords
    application program interfaces; computer graphic equipment; coprocessors; fast Fourier transforms; general purpose computers; parallel programming; API; CUDA programming model; FFT; GPU; divide and conquer strategy; fast Fourier Transform; general purpose computation; graphics processor units; parallel coprocessor; stream architecture; streaming; Graphics; Graphics processing unit; Instruction sets; Kernel; Memory management; Programming; FFT; GPUs; asynchronous communication; stream;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Information Technology and Applications (IFITA), 2010 International Forum on
  • Conference_Location
    Kunming
  • Print_ISBN
    978-1-4244-7621-3
  • Electronic_ISBN
    978-1-4244-7622-0
  • Type

    conf

  • DOI
    10.1109/IFITA.2010.76
  • Filename
    5635067