• DocumentCode
    3512899
  • Title
    Generating high performance pruned FFT implementations
  • Author
    Franchetti, Franz; Püschel, Markus
  • Author_Institution
    Dept. of Electrical and Computer Engineering, Carnegie Mellon University, Pittsburgh, PA
  • fYear
    2009
  • fDate
    19-24 April 2009
  • Firstpage
    549
  • Lastpage
    552
  • Abstract
    We derive a recursive, general-radix pruned Cooley-Tukey fast Fourier transform (FFT) algorithm in Kronecker product notation. The algorithm is compatible with the vectorization and parallelization required on state-of-the-art multicore CPUs. We integrate the pruned FFT algorithm into the program generation system Spiral and automatically generate optimized implementations of the pruned FFT for the Intel Core2Duo multicore processor. Experimental results show that the pruned FFT can speed up the fastest available FFT implementations by up to 30% when the problem size and the pattern of unused inputs and outputs are known in advance.
  • Keywords
    fast Fourier transforms; microprocessor chips; multiprocessing systems; Intel Core2Duo multicore processor; Kronecker product notation; general-radix pruned Cooley-Tukey fast Fourier transform; program generation system Spiral; pruned FFT algorithm; Application software; Discrete Fourier transforms; Flexible printed circuits; Microprocessors; Multicore processing; Pervasive computing; Signal processing algorithms; Software performance; Spirals; Multiprocessing; Vector processing
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Title
    IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2009)
  • Conference_Location
    Taipei
  • ISSN
    1520-6149
  • Print_ISBN
    978-1-4244-2353-8
  • Electronic_ISBN
    1520-6149
  • Type
    conf
  • DOI
    10.1109/ICASSP.2009.4959642
  • Filename
    4959642
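
The idea of input pruning described in the abstract can be illustrated with a minimal sketch. It is not the paper's general-radix algorithm and not Spiral-generated code: it assumes a radix-2 decimation-in-time recursion, a power-of-two transform size, and a pruning pattern in which only the leading inputs are non-zero. The names `fft` and `pruned_fft` are hypothetical and chosen for this illustration only.

```python
import cmath


def fft(x):
    """Plain recursive radix-2 Cooley-Tukey FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def pruned_fft(x, n):
    """Size-n DFT of x zero-padded to length n (assumes len(x) <= n, n a power of two).

    Only the len(x) leading inputs are non-zero, so sub-transforms whose
    inputs are known to be zero (or a single leading sample) are skipped.
    """
    m = len(x)
    if m == 0:
        # All inputs are zero: the spectrum is zero, nothing to compute.
        return [0j] * n
    if m == 1:
        # A single sample at index 0 has a constant spectrum.
        return [complex(x[0])] * n
    # Even and odd subsequences inherit the "leading non-zeros" pattern.
    even = pruned_fft(x[0::2], n // 2)
    odd = pruned_fft(x[1::2], n // 2)
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


# Usage: three non-zero inputs in a size-8 transform.
full = fft([1, 2, 3, 0, 0, 0, 0, 0])
pruned = pruned_fft([1, 2, 3], 8)
```

In this sketch the savings come only from cutting the recursion short; the butterflies at each remaining level are still computed in full. The pattern of unused inputs must be known before the call, which mirrors the abstract's condition that the problem size and pruning pattern are known in advance.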