• DocumentCode
    72444
  • Title

    A 1.22 TOPS and 1.52 mW/MHz Augmented Reality Multicore Processor With Neural Network NoC for HMD Applications

  • Author

    Gyeonghoon Kim ; Kyuho Lee ; Youchang Kim ; Seongwook Park ; Injoon Hong ; Kyeongryeol Bong ; Hoi-Jun Yoo

  • Author_Institution
    Dept. of Electr. Eng., KAIST, Daejeon, South Korea
  • Volume
    50
  • Issue
    1
  • fYear
    2015
  • fDate
    Jan. 2015
  • Firstpage
    113
  • Lastpage
    124
  • Abstract
    Real-time augmented reality (AR) is actively studied as a future user interface and experience for high-performance head-mounted display (HMD) systems. The small battery size and limited computing power of current HMDs, however, make real-time markerless AR impractical on the device. In this paper, we propose a real-time and low-power AR processor for advanced 3D-AR HMD applications. For high throughput, the processor adopts task-level pipelined SIMD-PE clusters and a congestion-aware network-on-chip (NoC). Both features exploit the high data-level parallelism (DLP) and task-level parallelism (TLP) of the pipelined multicore architecture. For low power consumption, it employs a vocabulary forest accelerator and a mixed-mode support vector machine (SVM)-based DVFS control to reduce unnecessary external memory accesses and core activation. The proposed 4 mm × 8 mm HMD AR processor is fabricated in 65 nm CMOS technology for a battery-powered HMD platform with real-time AR operation. It consumes 381 mW average power and 778 mW peak power at a 250 MHz operating frequency and a 1.2 V supply voltage. It achieves 1.22 TOPS peak performance and 1.57 TOPS/W energy efficiency, which are, respectively, 3.58× and 1.18× higher than the state of the art.
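    The headline figures in the title and abstract can be cross-checked with simple arithmetic. The sketch below assumes (the record does not say so explicitly) that the 1.57 TOPS/W figure is peak performance divided by peak power, and that the 1.52 mW/MHz figure in the title is average power divided by operating frequency. The variable names are illustrative only.

```python
# Figures quoted in the abstract.
peak_perf_tops = 1.22    # peak performance, TOPS
peak_power_w = 0.778     # peak power, W (778 mW)
avg_power_mw = 381.0     # average power, mW
freq_mhz = 250.0         # operating frequency, MHz

# Assumption: energy efficiency = peak performance / peak power.
energy_eff_tops_per_w = peak_perf_tops / peak_power_w

# Assumption: mW/MHz = average power / operating frequency.
power_per_mhz = avg_power_mw / freq_mhz

print(f"{energy_eff_tops_per_w:.2f} TOPS/W")
print(f"{power_per_mhz:.2f} mW/MHz")
```

    Under these assumptions, both results round to the values stated in the title and abstract.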
  • Keywords
    CMOS digital integrated circuits; augmented reality; helmet mounted displays; low-power electronics; multiprocessing systems; network-on-chip; neural chips; support vector machines; 3D-AR HMD applications; CMOS technology; DLP; SVM-based DVFS control; TLP; TOPS; augmented reality multicore processor; battery-powered HMD platform; congestion-aware network-on-chip; core activation; dynamic voltage and frequency scaling control; frequency 250 MHz; high data-level parallelism; high-performance head-mounted display system; limited computing power; low power consumption; low-power AR processor; mixed-mode support vector machine; neural network NoC; pipelined multicore architecture; real-time markerless AR; size 65 nm; small battery size; task-level parallelism; task-level pipelined SIMD-PE clusters; tera operation per second performance; user interface; vocabulary forest accelerator; voltage 1.2 V; Cameras; Feature extraction; Object recognition; Parallel processing; Real-time systems; Throughput; Visualization; 2D-mesh network-on-chip; AR processor architecture; Augmented reality (AR); congestion-aware task assignment; heterogeneous SIMD multicore architecture;
  • fLanguage
    English
  • Journal_Title
    Solid-State Circuits, IEEE Journal of
  • Publisher
    IEEE
  • ISSN
    0018-9200
  • Type

    jour

  • DOI
    10.1109/JSSC.2014.2352303
  • Filename
    6899706