DocumentCode
3674783
Title
A Locality Aware Convolutional Neural Networks Accelerator
Author
Runbin Shi;Zheng Xu;Zhihao Sun;Maurice Peemen;Ang Li;Henk Corporaal;Di Wu
Author_Institution
Dept. of Electron. &
fYear
2015
Firstpage
591
Lastpage
598
Abstract
The advantages of Convolutional Neural Networks (CNNs) over traditional methods for visual pattern recognition have changed the field of machine vision. The main issue hindering broad adoption of this technique is the massive computing workload of CNNs, which prevents real-time implementation on low-power embedded platforms. Recently, several dedicated solutions have been proposed to improve energy efficiency and throughput; nevertheless, the huge amount of data transfer involved in the processing remains a challenging issue. This work proposes a new CNN accelerator that exploits a novel memory access scheme, significantly improving data locality in CNN-related processing. With this scheme, external memory accesses are reduced by 50% while similar or even better throughput is achieved. The accelerator is implemented in 28 nm CMOS technology. Implementation results show that the accelerator achieves a performance of 102 GOp/s at 800 MHz while occupying 0.303 mm² of silicon area. Power simulation shows that the dynamic power of the accelerator is 68 mW. Its flexibility is demonstrated by running various CNN benchmarks.
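To make the locality argument concrete, the following is a minimal C sketch of loop tiling applied to a single-channel convolution. It is not the paper's actual memory access scheme; it only illustrates the general principle that reusing a small input tile held in a local buffer for all the output pixels it contributes to reduces how often each external memory word must be fetched. All sizes (IN_H, IN_W, K, TILE) are made-up example values.

/* Hypothetical sketch: tiled 2D convolution for data reuse.
 * Not the accelerator's real dataflow; sizes are illustrative only. */
#include <stdio.h>

#define IN_H 16                     /* input feature map height (example) */
#define IN_W 16                     /* input feature map width  (example) */
#define K    3                      /* kernel size              (example) */
#define OUT_H (IN_H - K + 1)
#define OUT_W (IN_W - K + 1)
#define TILE 4                      /* output tile edge length  (example) */

static float in[IN_H][IN_W];
static float w[K][K];
static float out[OUT_H][OUT_W];

int main(void) {
    /* Fill input and kernel with dummy data. */
    for (int y = 0; y < IN_H; y++)
        for (int x = 0; x < IN_W; x++)
            in[y][x] = (float)(y + x);
    for (int ky = 0; ky < K; ky++)
        for (int kx = 0; kx < K; kx++)
            w[ky][kx] = 0.1f;

    /* Tiled convolution: the (TILE+K-1)^2 input patch needed by one
     * output tile is small enough to stay in an on-chip buffer, so each
     * external input word is loaded once per tile instead of once per
     * output pixel that uses it. */
    for (int ty = 0; ty < OUT_H; ty += TILE) {
        for (int tx = 0; tx < OUT_W; tx += TILE) {
            for (int oy = ty; oy < ty + TILE && oy < OUT_H; oy++) {
                for (int ox = tx; ox < tx + TILE && ox < OUT_W; ox++) {
                    float acc = 0.0f;
                    for (int ky = 0; ky < K; ky++)
                        for (int kx = 0; kx < K; kx++)
                            acc += in[oy + ky][ox + kx] * w[ky][kx];
                    out[oy][ox] = acc;
                }
            }
        }
    }
    printf("out[0][0] = %f\n", out[0][0]);
    return 0;
}

In this sketch the reuse factor per fetched input word grows with the tile size; the paper's scheme pursues the same goal (fewer external accesses per multiply-accumulate) with a dedicated hardware buffering strategy rather than software loop tiling.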
Keywords
"System-on-chip","Random access memory","Convolution","Feature extraction","Buffer storage","Parallel processing","Registers"
Publisher
ieee
Conference_Titel
Digital System Design (DSD), 2015 Euromicro Conference on
Type
conf
DOI
10.1109/DSD.2015.70
Filename
7302332
Link To Document