DocumentCode
1362862
Title
Discrete Fourier transform on multicore
Author
Franchetti, Franz ; Püschel, Markus ; Voronenko, Yevgen ; Chellappa, Srinivas ; Moura, José M F
Author_Institution
Electr. & Comput. Eng. (ECE) Dept., Carnegie Mellon Univ., Pittsburgh, PA, USA
Volume
26
Issue
6
fYear
2009
fDate
11/1/2009 12:00:00 AM
Firstpage
90
Lastpage
102
Abstract
This article gives an overview of the techniques needed to implement the discrete Fourier transform (DFT) efficiently on current multicore systems. The focus is on Intel-compatible multicores, but we also discuss the IBM Cell and, briefly, graphics processing units (GPUs). The performance optimization is broken down into three key challenges: parallelization, vectorization, and memory hierarchy optimization. In each case, we use the Kronecker product formalism to formally derive the necessary algorithmic transformations based on a few hardware parameters. Further code-level optimizations are discussed. The rigorous nature of this framework enables the complete automation of the implementation task as shown by the program generator Spiral. Finally, we show and analyze DFT benchmarks of the fastest libraries available for the considered platforms.
Keywords
coprocessors; discrete Fourier transforms; matrix algebra; parallel architectures; performance evaluation; DFT; GPU; IBM Cell; Intel-compatible multicores; Kronecker product formalism; Spiral; code-level optimizations; discrete Fourier transform; graphics processing units; memory hierarchy optimization; multicore performance; multicore systems; program generator; Central Processing Unit; Discrete Fourier transforms; Discrete transforms; Graphics; Instruction sets; Libraries; Multicore processing; Optimization; Signal processing algorithms; Spirals
fLanguage
English
Journal_Title
Signal Processing Magazine, IEEE
Publisher
IEEE
ISSN
1053-5888
Type
jour
DOI
10.1109/MSP.2009.934155
Filename
5230808