DocumentCode :
72444
Title :
A 1.22 TOPS and 1.52 mW/MHz Augmented Reality Multicore Processor With Neural Network NoC for HMD Applications
Author :
Gyeonghoon Kim ; Kyuho Lee ; Youchang Kim ; Seongwook Park ; Injoon Hong ; Kyeongryeol Bong ; Hoi-Jun Yoo
Author_Institution :
Dept. of Electr. Eng., KAIST, Daejeon, South Korea
Volume :
50
Issue :
1
fYear :
2015
fDate :
Jan. 2015
Firstpage :
113
Lastpage :
124
Abstract :
Real-time augmented reality (AR) is actively studied as a future user interface and experience for high-performance head-mounted display (HMD) systems. The small battery size and limited computing power of current HMDs, however, have prevented real-time markerless AR from being implemented on them. In this paper, we propose a real-time and low-power AR processor for advanced 3D-AR HMD applications. For high throughput, the processor adopts task-level pipelined SIMD-PE clusters and a congestion-aware network-on-chip (NoC). Both features exploit the high data-level parallelism (DLP) and task-level parallelism (TLP) of the pipelined multicore architecture. For low power consumption, it employs a vocabulary forest accelerator and a mixed-mode support vector machine (SVM)-based DVFS control to reduce unnecessary external memory accesses and core activation. The proposed 4 mm × 8 mm HMD AR processor is fabricated in 65 nm CMOS technology for a battery-powered HMD platform with real-time AR operation. It consumes 381 mW average power and 778 mW peak power at a 250 MHz operating frequency and a 1.2 V supply voltage. It achieves 1.22 TOPS peak performance and 1.57 TOPS/W energy efficiency, which are, respectively, 3.58× and 1.18× higher than the state of the art.
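The headline figures in the title and abstract can be cross-checked with simple arithmetic. The sketch below is not from the paper; it assumes that the mW/MHz figure divides average power by the operating frequency and that the TOPS/W figure divides peak performance by peak power, which is the pairing that reproduces the quoted numbers. All variable names are illustrative.

```python
# Figures quoted in the abstract.
avg_power_mw = 381.0   # average power, mW
peak_power_mw = 778.0  # peak power, mW
freq_mhz = 250.0       # operating frequency, MHz
peak_tops = 1.22       # peak performance, TOPS

# Assumption: the title's mW/MHz figure is average power over clock frequency.
power_per_mhz = avg_power_mw / freq_mhz

# Assumption: energy efficiency is peak throughput over peak power (in watts).
tops_per_watt = peak_tops / (peak_power_mw / 1000.0)

print(f"{power_per_mhz:.2f} mW/MHz")  # 1.52 mW/MHz
print(f"{tops_per_watt:.2f} TOPS/W")  # 1.57 TOPS/W
```

Under these assumptions the results match the 1.52 mW/MHz in the title and the 1.57 TOPS/W in the abstract.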
Keywords :
CMOS digital integrated circuits; augmented reality; helmet mounted displays; low-power electronics; multiprocessing systems; network-on-chip; neural chips; support vector machines; 3D-AR HMD applications; CMOS technology; DLP; SVM-based DVFS control; TLP; TOPS; augmented reality multicore processor; battery-powered HMD platform; congestion-aware network-on-chip; core activation; dynamic voltage and frequency scaling control; frequency 250 MHz; high data-level parallelism; high-performance head-mounted display system; limited computing power; low power consumption; low-power AR processor; mixed-mode support vector machine; neural network NoC; pipelined multicore architecture; real-time markerless AR; size 65 nm; small battery size; task-level parallelism; task-level pipelined SIMD-PE clusters; tera operation per second performance; user interface; vocabulary forest accelerator; voltage 1.2 V; Cameras; Feature extraction; Object recognition; Parallel processing; Real-time systems; Throughput; Visualization; 2D-mesh network-on-chip; AR processor architecture; Augmented reality (AR); congestion-aware task assignment; heterogeneous SIMD multicore architecture;
fLanguage :
English
Journal_Title :
Solid-State Circuits, IEEE Journal of
Publisher :
IEEE
ISSN :
0018-9200
Type :
jour
DOI :
10.1109/JSSC.2014.2352303
Filename :
6899706