Title :
3D recursive Gaussian IIR on GPU and FPGAs — A case for accelerating bandwidth-bounded applications
Author :
Cong, Jason ; Huang, Muhuan ; Zou, Yi
Author_Institution :
Comput. Sci. Dept., Univ. of California, Los Angeles, CA, USA
Abstract :
GPU devices typically have higher off-chip bandwidth than FPGA-based systems, so GPUs would be expected to perform better on bandwidth-bounded, massively parallel applications. In this paper, we present our implementations of a 3D recursive Gaussian IIR on multi-core CPU, many-core GPU and multi-FPGA platforms. Our baseline implementation on the CPU uses the smallest amount of arithmetic computation (2 MADDs per dimension). While this application is clearly bandwidth-bounded, the differences in the memory subsystems translate to different bandwidth optimization techniques on each platform. Our implementations on the GPU and FPGA platforms show 26X and 33X speedups, respectively, over optimized single-thread code on the CPU.
Keywords :
Gaussian processes; field programmable gate arrays; recursive filters; 3D recursive Gaussian IIR; FPGA; GPU; accelerating bandwidth-bounded application; bandwidth optimization technique; memory subsystems; multicore CPU; Bandwidth; Convolution; Field programmable gate arrays; Graphics processing unit; Instruction sets; Smoothing methods; Three dimensional displays;
Conference_Title :
Application Specific Processors (SASP), 2011 IEEE 9th Symposium on
Conference_Location :
San Diego, CA
Print_ISBN :
978-1-4577-1212-8
DOI :
10.1109/SASP.2011.5941081