DocumentCode :
3295764
Title :
An Adaptive Dynamic Scheduling Scheme for H.264/AVC Decoding on Multicore Architecture
Author :
Dung Vu ; Jilong Kuang ; Laxmi Bhuyan
Author_Institution :
Comput. Sci. & Eng. Dept., Univ. of California, Riverside, Riverside, CA, USA
fYear :
2012
fDate :
9-13 July 2012
Firstpage :
491
Lastpage :
496
Abstract :
Parallelizing H.264/AVC decoding on multicore architectures is challenging because of inherent structural and functional dependencies at both the frame and macro-block levels: macro-blocks and certain frame types must be decoded in sequential order. Dynamic scheduling with recursive tail submit, one of the best existing algorithms, provides good throughput by exploiting macro-block level parallelism and mitigating global queue contention. Nevertheless, it fails to achieve optimal performance because of 1) its use of a global queue, which incurs substantial synchronization overhead as the number of cores increases, and 2) its unawareness of cache locality with respect to the underlying hierarchical core/cache topology, which results in unnecessary latency, communication cost and load imbalance. In this paper, we propose an adaptive dynamic scheduling scheme that employs multiple local queues to reduce lock contention and assigns tasks in a cache-locality-aware and load-balancing fashion, so that neighboring macro-blocks are preferably dispatched to nearby cores. We design, implement and evaluate our scheme on a 32-core cc-NUMA SGI server. Compared with existing alternatives on real benchmark applications, our scheme produces higher throughput and lower latency with a more balanced workload and less communication cost.
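The abstract does not give the algorithm itself, so the following is only an illustrative sketch of the general idea it describes: per-core local queues, a preference for the core that decoded a neighboring macro-block, and a fallback to a nearby less-loaded core. All names (`deps`, `pick_core`, `simulate`, `max_skew`) are hypothetical, the dependency pattern is the usual left/top-left/top/top-right intra-frame pattern, and the use of core-index distance as a stand-in for core/cache topology is an assumption, not the authors' method. The simulation is sequential and only models task placement and ordering, not real threads or timing.

```python
from collections import deque


def deps(r, c, rows, cols):
    """Macro-blocks that must be decoded before macro-block (r, c)."""
    cand = [(r, c - 1), (r - 1, c - 1), (r - 1, c), (r - 1, c + 1)]
    return [(i, j) for i, j in cand if 0 <= i < rows and 0 <= j < cols]


def pick_core(preferred, queues, max_skew=2):
    """Keep the task on the preferred core unless its queue is more than
    max_skew entries longer than the shortest queue; otherwise choose the
    shortest queue whose core index is closest to the preferred core.
    Assumption: cores with close indices share cache levels."""
    shortest = min(len(q) for q in queues)
    if len(queues[preferred]) - shortest <= max_skew:
        return preferred
    candidates = [k for k, q in enumerate(queues) if len(q) == shortest]
    return min(candidates, key=lambda k: abs(k - preferred))


def simulate(rows, cols, n_cores, max_skew=2):
    """Sequentially simulate decoding one frame of rows x cols macro-blocks.
    Returns the decode order and a map from macro-block to owning core."""
    queues = [deque() for _ in range(n_cores)]  # one local queue per core
    pending = {(r, c): len(deps(r, c, rows, cols))
               for r in range(rows) for c in range(cols)}
    order, owner = [], {}
    queues[0].append((0, 0))  # the top-left macro-block has no dependencies
    while len(order) < rows * cols:
        for core, q in enumerate(queues):
            if not q:
                continue
            r, c = mb = q.popleft()
            owner[mb] = core
            order.append(mb)
            # Release successors whose dependencies are now all satisfied.
            for succ in [(r, c + 1), (r + 1, c + 1), (r + 1, c), (r + 1, c - 1)]:
                if succ in pending:
                    pending[succ] -= 1
                    if pending[succ] == 0:
                        target = pick_core(core, queues, max_skew)
                        queues[target].append(succ)
    return order, owner
```

A call such as `simulate(4, 6, 4)` returns a decode order that respects the macro-block dependencies, together with the core each macro-block was placed on. Lowering `max_skew` favors load balance over locality in this toy model.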
Keywords :
decoding; resource allocation; scheduling; synchronisation; video coding; H.264/AVC decoding; adaptive dynamic scheduling scheme; cache locality aware; cc-NUMA SGI server; global queue contention mitigation; hierarchical core-cache topology; load-balancing; macroblock level parallelism; multicore architecture; multiple local queues; recursive tail submit; synchronization overhead; Decoding; Dynamic scheduling; Instruction sets; Multicore processing; Parallel processing; Synchronization; Topology; H.264/AVC-decoding; core/cache topology; macro-block level parallelism; multicore architecture;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Multimedia and Expo (ICME), 2012 IEEE International Conference on
Conference_Location :
Melbourne, VIC
ISSN :
1945-7871
Print_ISBN :
978-1-4673-1659-0
Type :
conf
DOI :
10.1109/ICME.2012.9
Filename :
6298449