• DocumentCode
    1362869
  • Title

    The parallelization of video processing

  • Author

    Lin, Dennis ; Huang, Xiaohuang Victor ; Nguyen, Quang ; Blackburn, Joshua ; Rodrigues, Christopher ; Huang, Thomas ; Do, Minh N. ; Patel, Sanjay J. ; Hwu, Wen-Mei W.

  • Volume
    26
  • Issue
    6
  • fYear
    2009
  • fDate
    11/1/2009 12:00:00 AM
  • Firstpage
    103
  • Lastpage
    112
  • Abstract
    In this article, we focus on the applicability of parallel computing architectures to video processing applications. We demonstrate different optimization strategies in detail using the 3-D convolution problem as an example, and show how they affect performance on both many-core CPUs and symmetric multiprocessor CPUs. Applying these strategies to case studies from three video processing domains brings out some trends. The highly uniform, abundant parallelism in many video processing kernels means that they are well suited to a simple, massively parallel task-based model such as CUDA. As a result, we often see performance increases of ten times or more when running on many-core hardware. Some kernels, however, push the limits of CUDA, because their memory accesses cannot be shaped into regular, vectorizable patterns or because they cannot be efficiently decomposed into small independent tasks. Such kernels, like the depth propagation kernel in the section "Synthesis Example: Depth Image-Based Rendering," may achieve a modest speedup, but they are probably better suited to a more flexible parallel programming model. We look forward to additional advances as more researchers learn to harness the processing capabilities of the latest generation of computation hardware.
  • Keywords
    multiprocessing systems; parallel programming; rendering (computer graphics); video signal processing; 3D convolution problem; computer processing unit; depth image-based rendering; depth propagation kernel; many-core CPU; many-core hardware; optimization; parallel computing architecture; parallel programming model; symmetric multiprocessor CPU; video processing kernels; video processing parallelization; Bandwidth; Concurrent computing; Explosives; Internet; Multicore processing; Parallel processing; Parallel programming; Signal synthesis; Video compression; Video sharing;
  • fLanguage
    English
  • Journal_Title
    Signal Processing Magazine, IEEE
  • Publisher
    IEEE
  • ISSN
    1053-5888
  • Type

    jour

  • DOI
    10.1109/MSP.2009.934116
  • Filename
    5230809