• DocumentCode
    2025737
  • Title
    3D recursive Gaussian IIR on GPU and FPGAs — A case for accelerating bandwidth-bounded applications
  • Author
    Cong, Jason ; Huang, Muhuan ; Zou, Yi
  • Author_Institution
    Comput. Sci. Dept., Univ. of California, Los Angeles, CA, USA
  • fYear
    2011
  • fDate
    5-6 June 2011
  • Firstpage
    70
  • Lastpage
    73
  • Abstract
    GPU devices typically have higher off-chip bandwidth than FPGA-based systems, so GPUs would be expected to perform better on bandwidth-bounded, massively parallel applications. In this paper, we present our implementations of a 3D recursive Gaussian IIR on multi-core CPU, many-core GPU, and multi-FPGA platforms. Our baseline implementation on the CPU uses the smallest amount of arithmetic computation (2 MADDs per dimension). While this application is clearly bandwidth bounded, the differences in the memory subsystems translate to different bandwidth optimization techniques. Our implementations on the GPU and FPGA platforms show 26X and 33X speedups, respectively, over optimized single-threaded code on the CPU.
  • Keywords
    Gaussian processes; field programmable gate arrays; recursive filters; 3D recursive Gaussian IIR; FPGA; GPU; accelerating bandwidth-bounded application; bandwidth optimization technique; memory subsystems; multicore CPU; Bandwidth; Convolution; Field programmable gate arrays; Graphics processing unit; Instruction sets; Smoothing methods; Three dimensional displays;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Application Specific Processors (SASP), 2011 IEEE 9th Symposium on
  • Conference_Location
    San Diego, CA
  • Print_ISBN
    978-1-4577-1212-8
  • Type
    conf
  • DOI
    10.1109/SASP.2011.5941081
  • Filename
    5941081