• DocumentCode
    1896641
  • Title
    Parallelizing and Optimizing H.264 on Synchronous Data Triggered Architecture
  • Author
    Cong, Liu ; Zhiying, Wang ; Xin, Lai ; Xinbiao, Gan ; Fangyuan, Chen
  • Author_Institution
    Sch. of Comput., Nat. Univ. of Defense Technol., Changsha, China
  • Volume
    2
  • fYear
    2012
  • fDate
    23-25 March 2012
  • Firstpage
    185
  • Lastpage
    190
  • Abstract
    Synchronous data triggered architecture (SDTA) offers high performance, flexible scalability, and low-cost communication between processor elements. Given the prevalence of multimedia applications, it is important to parallelize and optimize the implementation of the OpenMAX DL video component with SDTA support. H.264 is the core of the video part. In this paper, we propose several optimization techniques for H.264: loop unrolling, primitives and vectorization, dependence elimination, batch processing and revision, computing direction transformation, and load/store acceleration. With these techniques, we have parallelized all the H.264 APIs, and two representative APIs gain speedups of 5.6 and 6.5 over the original sequential versions. We then propose additional hardware for clip operations. With this hardware, the two test benches reduce execution time by 18% and 38%, respectively, and the speedups reach 6.7 and 10.3. We believe that these techniques are especially beneficial to multimedia applications, and may also benefit other applications on synchronous data triggered architecture.
  • Keywords
    application program interfaces; multimedia systems; optimisation; parallel architectures; video coding; API; H.264 coding; H.264 optimization technique; OpenMAX DL video component; OpenMAX development layer; SDTA support; application program interface; batch processing technique; batch revision technique; clip operation; computing direction transformation; dependences elimination technique; load-store acceleration technique; loop unrolling technique; multimedia application; primitives technique; processor element; synchronous data triggered architecture; vectorization technique; Acceleration; Decoding; Hardware; Image coding; Neurons; Registers; Vectors; H.264; SDTA; SIMD; optimization; parallelization;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computer Science and Electronics Engineering (ICCSEE), 2012 International Conference on
  • Conference_Location
    Hangzhou
  • Print_ISBN
    978-1-4673-0689-8
  • Type
    conf
  • DOI
    10.1109/ICCSEE.2012.288
  • Filename
    6187931