  • DocumentCode
    3226745
  • Title
    Optimized Deep Learning Architectures with Fast Matrix Operation Kernels on Parallel Platform
  • Author
    Ying Zhang; Saizheng Zhang
  • Author_Institution
    Dept. of Autom., Univ. of Sci. & Technol. of China, Hefei, China
  • fYear
    2013
  • fDate
    4-6 Nov. 2013
  • Firstpage
    71
  • Lastpage
    78
  • Abstract
    In this paper, we introduce an optimized deep learning architecture with flexible layer structures and fast matrix operation kernels on a parallel computing platform (e.g., NVIDIA's GPUs). Carefully designed layer-wise strategies integrate different kinds of deep architectures into a uniform neural training-testing system. Our fast matrix operation kernels are implemented in the deep architecture's propagation processes. In our experiments, these kernels save 70% of the time on average compared with the kernels in NVIDIA's CUBLAS library (widely used by many other neural network toolkits), and help our parallel deep architecture beat neural structures using CUBLAS kernels on practical problems.
  • Keywords
    graphics processing units; learning (artificial intelligence); matrix algebra; neural nets; parallel architectures; parallel programming; NVIDIA CUBLAS library; NVIDIA GPU; deep architecture propagation process; fast matrix operation kernels; flexible layer structures; layer-wise strategies; neural network toolkits; neural structure; neural training-testing system; optimized deep learning architecture; parallel computing platform; parallel deep architecture; Computer architecture; Graphics processing units; Integrated circuits; Kernel; Libraries; Training; Vectors; GPU; deep architecture; deep learning; kernel; matrix operation; parallel computing;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    2013 IEEE 25th International Conference on Tools with Artificial Intelligence (ICTAI)
  • Conference_Location
    Herndon, VA
  • ISSN
    1082-3409
  • Print_ISBN
    978-1-4799-2971-9
  • Type
    conf
  • DOI
    10.1109/ICTAI.2013.21
  • Filename
    6735232