• DocumentCode
    3691863
  • Title

    A CGRA-Based Approach for Accelerating Convolutional Neural Networks

  • Author

    Masakazu Tanomoto;Shinya Takamaeda-Yamazaki;Jun Yao;Yasuhiko Nakashima

  • Author_Institution
    Grad. Sch. of Inf. Sci., Nara Inst. of Sci. &
  • fYear
    2015
  • Firstpage
    73
  • Lastpage
    80
  • Abstract
    Convolutional neural networks (CNNs) are an emerging approach for achieving high recognition accuracy in various machine learning applications. To accelerate CNN computations, various GPU-based and application-specific hardware approaches have recently been proposed. However, because they require a large amount of computing hardware and a large absolute amount of energy, they are not suitable for embedded applications. In this paper, we propose a novel approach to accelerating CNN computations using a CGRA (Coarse-Grained Reconfigurable Architecture) for low-power embedded systems. We first present a new CGRA with distributed scratchpad memory blocks that enable efficient temporal blocking to reduce memory bandwidth pressure. We then show the architecture of our CNN accelerator, which uses the CGRA together with a dedicated software implementation. We evaluated our approach by comparing it with existing platforms, such as high-end and mobile GPUs and general multicore CPUs. The evaluation results show that our proposal achieves 1.93x higher performance per memory bandwidth and 2.92x higher area performance.
  • Keywords
    "Convolution","Bandwidth","Hardware","Machine learning","Acceleration","Neural networks","Arrays"
  • Publisher
    IEEE
  • Conference_Titel
    Embedded Multicore/Many-core Systems-on-Chip (MCSoC), 2015 IEEE 9th International Symposium on
  • Type

    conf

  • DOI
    10.1109/MCSoC.2015.41
  • Filename
    7328189