• DocumentCode
    3295764
  • Title
    An Adaptive Dynamic Scheduling Scheme for H.264/AVC Decoding on Multicore Architecture
  • Author
    Dung Vu; Jilong Kuang; Laxmi Bhuyan
  • Author_Institution
    Comput. Sci. & Eng. Dept., Univ. of California, Riverside, Riverside, CA, USA
  • fYear
    2012
  • fDate
    9-13 July 2012
  • Firstpage
    491
  • Lastpage
    496
  • Abstract
    Parallelizing H.264/AVC decoding on multicore architectures is challenging because of inherent structural and functional dependencies at both the frame and macro-block levels: macro-blocks and certain frame types must be decoded in sequential order. The dynamic scheduling scheme with recursive tail submit, one of the best existing algorithms, provides good throughput by exploiting macro-block level parallelism and mitigating global queue contention. Nevertheless, it fails to achieve optimal performance due to 1) its use of a global queue, which incurs substantial synchronization overhead as the number of cores increases, and 2) its unawareness of cache locality with respect to the underlying hierarchical core/cache topology, which results in unnecessary latency, communication cost and load imbalance. In this paper, we propose an adaptive dynamic scheduling scheme that employs multiple local queues to reduce lock contention and assigns tasks in a cache-locality-aware, load-balancing fashion so that neighboring macro-blocks are preferentially dispatched to nearby cores. We design, implement and evaluate our scheme on a 32-core cc-NUMA SGI server. Running real benchmark applications, we observe that our scheme produces higher throughput and lower latency than existing alternatives, with a more balanced workload and lower communication cost.
  • Keywords
    decoding; resource allocation; scheduling; synchronisation; video coding; H.264/AVC decoding; adaptive dynamic scheduling scheme; cache locality aware; cc-NUMA SGI server; global queue contention mitigation; hierarchical core-cache topology; load-balancing; macroblock level parallelism; multicore architecture; multiple local queues; recursive tail submit; synchronization overhead; Decoding; Dynamic scheduling; Instruction sets; Multicore processing; Parallel processing; Synchronization; Topology; H.264/AVC-decoding; core/cache topology; macro-block level parallelism; multicore architecture
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Multimedia and Expo (ICME), 2012 IEEE International Conference on
  • Conference_Location
    Melbourne, VIC
  • ISSN
    1945-7871
  • Print_ISBN
    978-1-4673-1659-0
  • Type
    conf
  • DOI
    10.1109/ICME.2012.9
  • Filename
    6298449
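
The abstract's two central ideas, macro-block dependencies and per-core local queues, can be illustrated with a small simulation. The sketch below is not the authors' algorithm. It assumes the usual H.264 neighbour pattern (each macro-block waits on its left, up-left, up and up-right neighbours) and a hypothetical locality rule in which row `y` is owned by core `y % num_cores`, so a macro-block and its left neighbour always share a core. The names `mb_dependencies` and `schedule` are illustrative only.

```python
from collections import deque


def mb_dependencies(x, y, width):
    """Neighbours that macro-block (x, y) waits on (assumed pattern)."""
    deps = []
    if x > 0:
        deps.append((x - 1, y))          # left
    if y > 0:
        deps.append((x, y - 1))          # up
        if x > 0:
            deps.append((x - 1, y - 1))  # up-left
        if x < width - 1:
            deps.append((x + 1, y - 1))  # up-right
    return deps


def schedule(width, height, num_cores):
    """Simulate decoding a width x height grid (both > 0) of macro-blocks.

    Every core has its own local queue; there is no global queue.
    Returns a list of steps, each a list of (core, (x, y)) pairs that are
    decoded in parallel during that step.
    """
    # Number of unfinished dependencies per macro-block.
    pending = {(x, y): len(mb_dependencies(x, y, width))
               for y in range(height) for x in range(width)}
    queues = [deque() for _ in range(num_cores)]
    queues[0].append((0, 0))  # the top-left macro-block has no dependencies
    steps = []
    while any(queues):
        # Each core takes at most one ready macro-block from its own queue.
        step = [(core, q.popleft()) for core, q in enumerate(queues) if q]
        steps.append(step)
        # Finished macro-blocks release the neighbours that depend on them.
        for _, (x, y) in step:
            for nx, ny in ((x + 1, y), (x, y + 1),
                           (x + 1, y + 1), (x - 1, y + 1)):
                if 0 <= nx < width and ny < height:
                    pending[(nx, ny)] -= 1
                    if pending[(nx, ny)] == 0:
                        # Illustrative locality rule: row ny belongs to one core.
                        queues[ny % num_cores].append((nx, ny))
    return steps
```

For a 4x3 grid, `schedule(4, 3, 3)` finishes in 8 steps while `schedule(4, 3, 1)` needs 12, which shows the wavefront parallelism the dependencies allow. A real scheduler would also need load balancing across queues and awareness of the actual core/cache topology, neither of which this sketch models.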