DocumentCode :
34682
Title :
The Fastest Fourier Transform in the South
Author :
Blake, Anthony M. ; Witten, I.H. ; Cree, Michael J.
Author_Institution :
Dept. of Comput. Sci., Univ. of Waikato, Hamilton, New Zealand
Volume :
61
Issue :
19
fYear :
2013
fDate :
Oct.1, 2013
Firstpage :
4707
Lastpage :
4716
Abstract :
This paper describes FFTS, a discrete Fourier transform (DFT) library that achieves state-of-the-art performance using a new cache-oblivious algorithm implemented with run-time specialization. During initialization the transform parameters, such as sign and direction, are fixed, allowing sequencing and data access information to be precomputed and specialized machine code to be generated. The resulting code may be executed any number of times on different input data. In contrast to FFTW, FFTS does not use large base cases at the leaves of recursion, nor a substantial library of codelets, and does not require time-consuming machine-specific calibration. The code presented in this paper has been benchmarked on recent Intel x86 and ARM machines, and is, in almost all cases, faster than self-tuning libraries such as FFTW, and even vendor-tuned libraries such as Intel IPP and Apple vDSP.
Keywords :
cache storage; discrete Fourier transforms; program compilers; software libraries; ARM machines; Apple vDSP; DFT library; FFTS; FFTW; Intel IPP; Intel x86; cache-oblivious algorithm; codelets; data access information; discrete Fourier transform; dynamic code generation; fastest Fourier transform; run-time specialization; self-tuning libraries; specialized machine code; transform parameters; vendor-tuned libraries; Discrete Fourier transforms; Libraries; Prediction algorithms; Runtime; Signal processing algorithms; Dynamic code generation; FFT; Fourier transform; run-time specialization;
fLanguage :
English
Journal_Title :
Signal Processing, IEEE Transactions on
Publisher :
ieee
ISSN :
1053-587X
Type :
jour
DOI :
10.1109/TSP.2013.2273199
Filename :
6557535
Link To Document :
بازگشت