  • DocumentCode
    3278023
  • Title
    Communication-minimizing 2D convolution in GPU registers
  • Author
    Iandola, Forrest N. ; Sheffield, David ; Anderson, Michael J. ; Phothilimthana, Phitchaya Mangpo ; Keutzer, Kurt
  • Author_Institution
    Parallel Comput. Lab. (ParLab), Univ. of California, Berkeley, Berkeley, CA, USA
  • fYear
    2013
  • fDate
    15-18 Sept. 2013
  • Firstpage
    2116
  • Lastpage
    2120
  • Abstract
    2D image convolution is ubiquitous in image processing and computer vision problems such as feature extraction. Exploiting parallelism is a common strategy for accelerating convolution. Parallel processors keep getting faster, but algorithms such as image convolution remain memory-bound on parallel processors such as GPUs. Therefore, reducing memory communication is fundamental to accelerating image convolution. To reduce memory communication, we reorganize the convolution algorithm to prefetch image regions into registers, and we do more work per thread with fewer threads. To enable portability to future architectures, we implement a convolution autotuner that sweeps the design space of memory layouts and loop unrolling configurations. We focus on convolution with small filters (2×2 to 7×7), but our techniques can be extended to larger filter sizes. Depending on filter size, our speedups on two NVIDIA architectures range from 1.2× to 4.5× over state-of-the-art GPU libraries.
  • Keywords
    computer vision; convolution; graphics processing units; parallel processing; storage management; GPU libraries; GPU registers; NVIDIA architectures; communication-minimizing 2D image convolution; computer vision problems; convolution autotuner; feature extraction; image processing; image region prefetching; loop unrolling configurations; memory communication reduction; memory layout design space; parallel processors; parallelism exploitation; Convolution; GPU; autotuning; parallel
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Image Processing (ICIP), 2013 20th IEEE International Conference on
  • Conference_Location
    Melbourne, VIC
  • Type
    conf
  • DOI
    10.1109/ICIP.2013.6738436
  • Filename
    6738436
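
The abstract's central idea (prefetch an image region once, then compute several outputs per thread from it) can be illustrated with a small CPU-side model. The sketch below is not the authors' GPU implementation: the function names `conv2d_naive` and `conv2d_tiled`, the `tile` parameter, and the load counters are hypothetical, and a Python list stands in for a thread's registers. It computes an unflipped (correlation-style) "valid" convolution and counts how many image elements each strategy reads.

```python
def conv2d_naive(image, filt):
    """One output pixel per simulated thread; every tap is a separate image load."""
    h, w, k = len(image), len(image[0]), len(filt)
    out_h, out_w = h - k + 1, w - k + 1
    out = [[0.0] * out_w for _ in range(out_h)]
    loads = 0  # number of image elements read from "global memory"
    for y in range(out_h):
        for x in range(out_w):
            acc = 0.0
            for i in range(k):
                for j in range(k):
                    acc += image[y + i][x + j] * filt[i][j]
                    loads += 1
            out[y][x] = acc
    return out, loads


def conv2d_tiled(image, filt, tile=4):
    """Each simulated thread prefetches a (tile+k-1)^2 region, then computes
    up to tile*tile outputs from that local copy (the stand-in for registers)."""
    h, w, k = len(image), len(image[0]), len(filt)
    out_h, out_w = h - k + 1, w - k + 1
    out = [[0.0] * out_w for _ in range(out_h)]
    loads = 0
    for ty in range(0, out_h, tile):
        for tx in range(0, out_w, tile):
            th = min(tile, out_h - ty)  # edge tiles may be smaller
            tw = min(tile, out_w - tx)
            # Prefetch: each needed image element is read exactly once per tile.
            regs = [[image[ty + r][tx + c] for c in range(tw + k - 1)]
                    for r in range(th + k - 1)]
            loads += (th + k - 1) * (tw + k - 1)
            # All arithmetic below touches only the local copy.
            for y in range(th):
                for x in range(tw):
                    acc = 0.0
                    for i in range(k):
                        for j in range(k):
                            acc += regs[y + i][x + j] * filt[i][j]
                    out[ty + y][tx + x] = acc
    return out, loads


if __name__ == "__main__":
    img = [[(r * 10 + c) % 7 for c in range(10)] for r in range(10)]
    sobel_x = [[1, 0, -1], [2, 0, -2], [1, 0, -1]]
    ref, naive_loads = conv2d_naive(img, sobel_x)
    res, tiled_loads = conv2d_tiled(img, sobel_x, tile=4)
    print(ref == res, naive_loads, tiled_loads)
```

For a 10×10 image and a 3×3 filter, the naive version reads 576 image elements while the tiled version with `tile=4` reads 144 and produces the same output. A real GPU also has caches and shared memory that this model ignores, so the ratio shows only the direction of the saving, not the speedups reported in the abstract.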