Title :
Vector processor customization for FFT
Author :
Spinean, Bogdan ; Kuzmanov, Georgi ; Gaydadjiev, Georgi
Author_Institution :
Comput. Eng. Lab., Delft Univ. of Technol., Delft, Netherlands
Abstract :
The performance gap between processors and memory systems keeps growing. Each technology generation increases on-chip performance capabilities; however, memory bandwidth increases at a much slower pace. Overall performance improvements are therefore constrained by the available memory bandwidth. In this paper, we address the memory bandwidth problem of vector processors by introducing hardware customizations that drastically reduce the memory transfers required by the FFT computation. We show that an FFT of length equal to the machine size Z can be performed using only O(Z) memory accesses, which reduces the memory bandwidth requirement by a factor of O(log(Z)) compared to a conventional vector machine. We achieve this bandwidth reduction by extending a classic IBM S/370 vector architecture for better register re-use. Our hardware extension completely eliminates the input bit reversal phase of the Cooley-Tukey FFT algorithm. Synthesis results suggest that our extension does not impact the machine cycle time and adds a hardware area overhead of under 4.5% to the vector register file, while potentially improving vector performance by a factor of 7.5 for Z = 256.
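The two ideas in the abstract, the input bit reversal phase of the Cooley-Tukey FFT and the O(Z) versus O(Z log(Z)) memory access counts, can be illustrated with a minimal software sketch. It is not the authors' hardware design. The function names and the access-count model are assumptions made for illustration: a conventional machine is assumed to load and store the whole vector once per butterfly stage, and a register-resident machine once in total. This simple count gives a ratio of log2(Z) and is not meant to reproduce the figures reported in the paper.

```python
import cmath


def bit_reverse(i, bits):
    """Reverse the lowest `bits` bits of the integer i."""
    r = 0
    for _ in range(bits):
        r = (r << 1) | (i & 1)
        i >>= 1
    return r


def fft_radix2(x):
    """Iterative radix-2 decimation-in-time Cooley-Tukey FFT (software sketch)."""
    z = len(x)
    if z == 0 or z & (z - 1):
        raise ValueError("length must be a power of two")
    bits = z.bit_length() - 1
    # Input bit reversal phase: the reordering the paper's extension eliminates.
    a = [complex(x[bit_reverse(i, bits)]) for i in range(z)]
    half = 1
    while half < z:  # log2(z) butterfly stages
        step = cmath.exp(-1j * cmath.pi / half)
        for start in range(0, z, 2 * half):
            w = 1 + 0j
            for k in range(half):
                u = a[start + k]
                v = a[start + k + half] * w
                a[start + k] = u + v
                a[start + k + half] = u - v
                w *= step
        half *= 2
    return a


def memory_accesses(z, register_resident):
    """Hypothetical counting model: one load and one store per element per pass."""
    stages = z.bit_length() - 1
    passes = 1 if register_resident else stages
    return 2 * z * passes


if __name__ == "__main__":
    z = 256
    conventional = memory_accesses(z, register_resident=False)
    resident = memory_accesses(z, register_resident=True)
    print(conventional, resident, conventional // resident)
```

Under this model the access count drops from 2 Z log2(Z) to 2 Z when all stages operate on data held in vector registers, which is the O(log(Z)) reduction the abstract describes.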
Keywords :
fast Fourier transforms; mathematics computing; microprocessor chips; Cooley-Tukey FFT algorithm; IBM S-370 vector architecture; O(Z) memory access; fast Fourier transform; hardware customization; memory bandwidth; memory systems; on-chip performance capability; vector processor customization; Arrays; Bandwidth; Indexes; Memory management; Registers; Transforms; Vector processors;
Conference_Title :
Embedded Computer Systems (SAMOS), 2011 International Conference on
Conference_Location :
Samos
Print_ISBN :
978-1-4577-0802-2
Electronic_ISBN :
978-1-4577-0801-5
DOI :
10.1109/SAMOS.2011.6045451