Title :
The Fastest Fourier Transform in the South
Author :
Blake, Anthony M. ; Witten, I.H. ; Cree, Michael J.
Author_Institution :
Dept. of Comput. Sci., Univ. of Waikato, Hamilton, New Zealand
Abstract :
This paper describes FFTS, a discrete Fourier transform (DFT) library that achieves state-of-the-art performance using a new cache-oblivious algorithm implemented with run-time specialization. During initialization the transform parameters, such as sign and direction, are fixed, allowing sequencing and data access information to be precomputed and specialized machine code to be generated. The resulting code may be executed any number of times on different input data. In contrast to FFTW, FFTS does not use large base cases at the leaves of recursion, nor a substantial library of codelets, and does not require time-consuming machine-specific calibration. The code presented in this paper has been benchmarked on recent Intel x86 and ARM machines, and is, in almost all cases, faster than self-tuning libraries such as FFTW, and even vendor-tuned libraries such as Intel IPP and Apple vDSP.
Keywords :
cache storage; discrete Fourier transforms; program compilers; software libraries; ARM machines; Apple vDSP; DFT library; FFTS; FFTW; Intel IPP; Intel x86; cache-oblivious algorithm; codelets; data access information; discrete Fourier transform; dynamic code generation; fastest Fourier transform; run-time specialization; self-tuning libraries; specialized machine code; transform parameters; vendor-tuned libraries; Discrete Fourier transforms; Libraries; Prediction algorithms; Runtime; Signal processing algorithms; Dynamic code generation; FFT; Fourier transform; run-time specialization;
Journal_Title :
Signal Processing, IEEE Transactions on
DOI :
10.1109/TSP.2013.2273199