Title :
Automatic generation of implementations for DSP transforms on fused multiply-add architectures
Author :
Voronenko, Yevgen ; Püschel, Markus
Author_Institution :
Dept. of Electr. & Comput. Eng., Carnegie Mellon Univ., Pittsburgh, PA, USA
Abstract :
Many modern computer architectures feature fused multiply-add (FMA) instructions, which offer potentially faster performance for numerical applications. For DSP transforms, compilers can only generate FMA code to a very limited extent because optimal use of FMAs requires modifying the chosen algorithm. In this paper, we present a framework for automatically generating FMA code for every linear DSP transform, which we implemented as an extension to the SPIRAL code generation system. We show that for many transforms and transform sizes, our generated FMA code matches the best-known hand-derived FMA algorithms in terms of arithmetic cost. Further, we present actual runtime results that show the speed-up obtained by using FMA instructions.
Keywords :
automatic programming; digital arithmetic; discrete Fourier transforms; discrete cosine transforms; program compilers; DCT; DFT; DSP transform implementations; FMA code arithmetic cost; FMA instructions; SPIRAL code generation system; automatic code generation; compilers; fused multiply-add architectures; linear DSP transform; Arithmetic; Computer architecture; Costs; Digital signal processing; Discrete Fourier transforms; Discrete cosine transforms; Discrete transforms; Runtime; Signal processing algorithms; Spirals
Conference_Title :
Acoustics, Speech, and Signal Processing, 2004. Proceedings. (ICASSP '04). IEEE International Conference on
Print_ISBN :
0-7803-8484-9
DOI :
10.1109/ICASSP.2004.1327057