Title :
A 320 mW 342 GOPS Real-Time Dynamic Object Recognition Processor for HD 720p Video Streams
Author :
Jinwook Oh ; Gyeonghoon Kim ; Junyoung Park ; Injoon Hong ; Seungjin Lee ; Joo-Young Kim ; Jeong-Ho Woo ; Hoi-Jun Yoo
Author_Institution :
Dept. of Electr. Eng., Korea Adv. Inst. of Sci. & Technol. (KAIST), Daejeon, South Korea
Abstract :
A heterogeneous multi-core processor is proposed to achieve real-time dynamic object recognition on HD 720p video streams. A context-aware visual attention model is proposed to reduce the computing power required for HD object recognition by improving attention accuracy. To execute the proposed algorithm in real time, the processor adopts a 5-stage task-level pipeline that maximizes the utilization of its 31 heterogeneous cores, comprising four simultaneous multithreading feature extraction clusters, a cache-based feature matching processor and a machine learning engine. Dynamic resource management adaptively tunes thread allocation and power management during execution, based on the detected task load and hardware utilization, to increase energy efficiency. As a result, the 32 mm² chip, fabricated in 0.13 μm CMOS technology, achieves 30 frames/s with 342 8-bit GOPS peak performance and 320 mW average power dissipation, a 2.72× performance improvement and a 2.54× per-pixel energy reduction compared to the previous state of the art.
Keywords :
CMOS integrated circuits; cache storage; feature extraction; image matching; learning (artificial intelligence); multi-threading; multiprocessing systems; object recognition; real-time systems; video streaming; 5-stage task-level pipeline; CMOS technology; GOPS peak performance; GOPS real-time dynamic object recognition processor; HD 720p video stream; HD object recognition; attention accuracy; average power dissipation; cache-based feature matching processor; computing power; context-aware visual attention model; dynamic resource management; energy efficiency; energy reduction; hardware utilization; heterogeneous multicore processor; machine learning engine; multithreading feature extraction cluster; power 320 mW; power management; size 0.13 μm; storage capacity 8 bit; thread allocation; Bandwidth; Computer architecture; Feature extraction; High definition video; Object recognition; Pipelines; System-on-a-chip; Multi-core processor; dynamic resource management; dynamic voltage and frequency scaling; heterogeneous; low power processor; object recognition; scale invariant feature transform;
Journal_Title :
Solid-State Circuits, IEEE Journal of
DOI :
10.1109/JSSC.2012.2220651