DocumentCode
2025737
Title
3D recursive Gaussian IIR on GPU and FPGAs — A case for accelerating bandwidth-bounded applications
Author
Cong, Jason ; Huang, Muhuan ; Zou, Yi
Author_Institution
Comput. Sci. Dept., Univ. of California, Los Angeles, CA, USA
fYear
2011
fDate
5-6 June 2011
Firstpage
70
Lastpage
73
Abstract
GPU devices typically have higher off-chip bandwidth than FPGA-based systems, so GPUs would be expected to perform better on bandwidth-bounded, massively parallel applications. In this paper, we present our implementations of a 3D recursive Gaussian IIR filter on multi-core CPU, many-core GPU, and multi-FPGA platforms. Our baseline CPU implementation uses the smallest amount of arithmetic computation (2 MADDs per dimension). While this application is clearly bandwidth bounded, differences in the memory subsystems call for different bandwidth optimization techniques on each platform. Our GPU and FPGA implementations achieve 26X and 33X speedups, respectively, over optimized single-threaded CPU code.
Keywords
Gaussian processes; field programmable gate arrays; recursive filters; 3D recursive Gaussian IIR; FPGA; GPU; accelerating bandwidth-bounded application; bandwidth optimization technique; memory subsystems; multicore CPU; Bandwidth; Convolution; Field programmable gate arrays; Graphics processing unit; Instruction sets; Smoothing methods; Three dimensional displays;
fLanguage
English
Publisher
IEEE
Conference_Titel
Application Specific Processors (SASP), 2011 IEEE 9th Symposium on
Conference_Location
San Diego, CA
Print_ISBN
978-1-4577-1212-8
Type
conf
DOI
10.1109/SASP.2011.5941081
Filename
5941081