DocumentCode
1132571
Title
Algorithmic and architectural co-design of a motion-estimation engine for low-power video devices
Author
De Vleeschouwer, Christophe ; Nilsson, Tord ; Denolf, Kristof ; Bormans, Jan
Author_Institution
Commun. & Remote Sensing Lab., Univ. Catholique de Louvain, Louvain-la-Neuve, Belgium
Volume
12
Issue
12
fYear
2002
fDate
12/1/2002
Firstpage
1093
Lastpage
1105
Abstract
Because of the large number of data transfers it involves, the motion estimation (ME) engine is one of the most power-consuming components of any predictive video codec. As a consequence, power-optimized video coding relies primarily on a carefully designed motion estimator. This paper first presents a block ME algorithm that meets the requirements of high-quality inter-frame prediction and low computational complexity. It relies on a set of rules common to all recent fast and adaptive ME algorithms, but is designed to allow easy and extensive data reuse. Visiting the candidate positions in adjacent order during the search increases locality and maintains a near-regular data flow, which reduces data transfers and keeps control complexity low. Together with the reduction in computational complexity, this enables cost-efficient very large scale integration (VLSI) realizations. A pipelined parallel architecture is then proposed and discussed. It is generic in the sense that it suits both full-pel and half-pel ME. It is efficient because it allows close to 100% hardware utilization and a sharp decrease in peak memory bandwidth. It is suited to low-power implementation because it enables larger data reuse factors for the most probable stages of the adaptive algorithm, which reduces the average memory bandwidth and power consumption.
Keywords
VLSI; computational complexity; hardware-software codesign; motion estimation; parallel architectures; pipeline processing; video codecs; video coding; adaptive ME algorithms; adaptive algorithm; algorithmic co-design; architectural co-design; block ME algorithm; computational complexity reduction; data reuse; fast ME algorithms; full-pel ME; half-pel ME; hardware utilization; interframe prediction; low computational complexity; low control complexity; low-power implementation; low-power video devices; motion-estimation engine; near-regular data flow; peak memory bandwidth; pipelined parallel architecture; power consumption; power-optimized video coding; predictive video codec; very large scale integration; Algorithm design and analysis; Bandwidth; Computational complexity; Engines; Hardware; Motion estimation; Parallel architectures; Very large scale integration; Video codecs; Video coding
fLanguage
English
Journal_Title
Circuits and Systems for Video Technology, IEEE Transactions on
Publisher
IEEE
ISSN
1051-8215
Type
jour
DOI
10.1109/TCSVT.2002.806810
Filename
1175446