DocumentCode
1362869
Title
The parallelization of video processing
Author
Lin, Dennis ; Huang, Xiaohuang Victor ; Nguyen, Quang ; Blackburn, Joshua ; Rodrigues, Christopher ; Huang, Thomas ; Do, Minh N. ; Patel, Sanjay J. ; Hwu, Wen-Mei W.
Volume
26
Issue
6
fYear
2009
fDate
11/1/2009 12:00:00 AM
Firstpage
103
Lastpage
112
Abstract
In this article, we focus on the applicability of parallel computing architectures to video processing applications. We demonstrate different optimization strategies in detail using the 3-D convolution problem as an example, and show how they affect performance on both many-core CPUs and symmetric multiprocessor CPUs. Applying these strategies to case studies from three video processing domains brings out some trends. The highly uniform, abundant parallelism in many video processing kernels means that they are well suited to a simple, massively parallel task-based model such as CUDA. As a result, we often see performance increases of ten times or more when running on many-core hardware. Some kernels, however, push the limits of CUDA, because their memory accesses cannot be shaped into regular, vectorizable patterns or because they cannot be efficiently decomposed into small independent tasks. Such kernels, like the depth propagation kernel in the section "Synthesis Example: Depth Image-Based Rendering," may achieve a modest speedup, but they are probably better suited to a more flexible parallel programming model. We look forward to additional advances as more researchers learn to harness the processing capabilities of the latest generation of computation hardware.
Keywords
multiprocessing systems; parallel programming; rendering (computer graphics); video signal processing; 3D convolution problem; computer processing unit; depth image-based rendering; depth propagation kernel; many-core CPU; many-core hardware; optimization; parallel computing architecture; parallel programming model; symmetric multiprocessor CPU; video processing kernels; video processing parallelization; Bandwidth; Concurrent computing; Explosives; Internet; Multicore processing; Parallel processing; Parallel programming; Signal synthesis; Video compression; Video sharing;
fLanguage
English
Journal_Title
Signal Processing Magazine, IEEE
Publisher
IEEE
ISSN
1053-5888
Type
jour
DOI
10.1109/MSP.2009.934116
Filename
5230809