• DocumentCode
    727095
  • Title
    A fast variable block size motion estimation algorithm with refined search range for a two-layer data reuse scheme
  • Author
    Luheng Jia ; Chi-Ying Tsui ; Au, Oscar C. ; Amin Zheng
  • Author_Institution
    Dept. of Electron. & Comput. Eng., Hong Kong Univ. of Sci. & Technol., Hong Kong, China
  • fYear
    2015
  • fDate
    24-27 May 2015
  • Firstpage
    1206
  • Lastpage
    1209
  • Abstract
    Motion estimation (ME) is a key tool in a variety of video coding standards. With the increasing demand for higher-resolution video formats, limited memory bandwidth becomes a bottleneck for ME implementation. The two major problems are the heavy data loading from external memory to on-chip memory and the frequent data fetching from on-chip memory to the ME engine. To reduce both off-chip and on-chip memory bandwidth, we propose a two-layer data reuse scheme. On the macroblock (MB) layer, an advanced Level C data reuse scheme is presented. It employs two cooperating on-chip caches that load data in a novel local-snake scanning manner. On the block layer, we propose a fast variable block size motion estimation algorithm with a refined search window (RSW-VBSME). A new approach to the hardware implementation of VBSME is then built on this fast algorithm. Instead of obtaining the sums of absolute differences (SADs) of all modes at the same time, ME for the different block sizes is performed separately. This enables higher data reusability within an MB. The two-layer data reuse scheme achieves a reduction of more than 90% in off-chip memory bandwidth with a slight increase in on-chip memory size. Moreover, the on-chip memory bandwidth is also greatly reduced compared with other reuse methods with different VBSME implementations.
  • Keywords
    cache storage; microprocessor chips; motion estimation; video coding; RSW-VBSME; advanced Level C data reuse scheme; block layer; cooperating on-chip caches; external memory; fast variable block size motion estimation algorithm; frequent data fetching; huge data loading; limited memory bandwidth; local-snake scanning manner; macroblock layer; off-chip memory bandwidth reduction; on-chip memory bandwidth reduction; refined search range; refined search window; two-layer data reuse scheme; video coding standards; video format; Bandwidth; Loading; Manganese; Motion estimation; Strips; System-on-chip; Video coding; VBSME; data reuse; memory bandwidth;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Circuits and Systems (ISCAS), 2015 IEEE International Symposium on
  • Conference_Location
    Lisbon
  • Type
    conf
  • DOI
    10.1109/ISCAS.2015.7168856
  • Filename
    7168856
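
The abstract refers to the SAD (sum of absolute differences) cost evaluated over a search window during block-matching ME. As a rough illustration only, the following is a minimal sketch of a generic full-search block matcher for a single block size. It is not the RSW-VBSME algorithm or the data reuse scheme from the paper; the function names `sad` and `full_search`, the list-of-rows frame layout, and the `search_range` parameter are assumptions made for this example.

```python
def sad(cur, ref, cx, cy, rx, ry, size):
    """Sum of absolute differences between the size x size block of `cur`
    whose top-left corner is (cx, cy) and the block of `ref` at (rx, ry).
    Frames are assumed to be lists of rows of integer pixel values."""
    return sum(
        abs(cur[cy + j][cx + i] - ref[ry + j][rx + i])
        for j in range(size)
        for i in range(size)
    )


def full_search(cur, ref, bx, by, size, search_range):
    """Exhaustively test every displacement (dx, dy) within +/- search_range
    for the block at (bx, by) and return (best_sad, (dx, dy)).
    Candidates that fall outside the reference frame are skipped.
    Ties keep the first candidate found in raster order."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            rx, ry = bx + dx, by + dy
            if rx < 0 or ry < 0 or rx + size > width or ry + size > height:
                continue  # candidate block is not fully inside the frame
            cost = sad(cur, ref, bx, by, rx, ry, size)
            if best is None or cost < best[0]:
                best = (cost, (dx, dy))
    return best
```

Every candidate in this sketch re-reads a full block of reference pixels, and neighbouring candidates overlap heavily. That repeated access to the same reference data is the kind of memory traffic that data reuse schemes such as the one described in the abstract aim to reduce.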