Evaluation of GPU Architectures Using Spiking Neural Networks

Author

Pallipuram, Vivek K. ; Bhuiyan, Mansurul A. ; Smith, Malcolm C.

Author_Institution

Dept. of Electr. & Comput. Eng., Clemson Univ., Clemson, SC, USA

fYear

2011

fDate

19-21 July 2011

Firstpage

93

Lastpage

102

Abstract

In recent years, General-Purpose Graphical Processing Units (GP-GPUs) have entered the field of High-Performance Computing (HPC) as one of the primary architectural focuses for many research groups working with complex scientific applications. Nvidia's Tesla C2050, codenamed Fermi, and AMD's Radeon 5870 are two devices positioned to meet the computationally demanding needs of supercomputing research groups across the globe. Though Nvidia GPUs powered by CUDA have been the frequent choice of performance-centric research groups, the introduction and growth of OpenCL has promoted AMD GP-GPUs as potential accelerator candidates that can challenge Nvidia's stronghold. These devices not only offer a plethora of features for application developers to explore, but their radically different architectures also call for a detailed study that weighs their merits and evaluates their potential to accelerate complex scientific applications. In this paper, we present our performance analysis research comparing Nvidia's Fermi and AMD's Radeon 5870 using OpenCL as the common programming model. We have chosen four different neuron models for Spiking Neural Networks (SNNs), each with different communication and computation requirements, namely the Izhikevich, Wilson, Morris-Lecar (ML), and Hodgkin-Huxley (HH) models. We compare the runtime performance of the Fermi and Radeon GPUs with an implementation that exhausts all optimization techniques available with OpenCL. Several equivalent architectural parameters of the two GPUs are studied and correlated with the application performance. In addition to the comparative study, our implementations achieved speed-ups of 857.3x and 658.51x on the Fermi and Radeon architectures, respectively, for the most compute-intensive HH model with a dense network containing 9.72 million neurons.
The final outcome of this research is a detailed architectural comparison of the two GPU architectures with a common programming platform.

Keywords

computer graphic equipment; coprocessors; neural nets; AMD Radeon 5870; CUDA; Fermi; GPU architectures; Hodgkin Huxley models; Morris Lecar models; Nvidia Tesla C2050; OpenCL; application developers; general-purpose graphical processing units; high-performance computing; optimization techniques; performance centric research groups; programming model; spiking neural networks; supercomputing research groups; Computational modeling; Computer architecture; Firing; Graphics processing unit; Mathematical model; Neurons; Optimization; AMD; Fermi; GPU Architecture Comparison; OpenCL; Profiler Counters; SNNs; speed-up;

fLanguage

English

Publisher

ieee

Conference_Titel

Application Accelerators in High-Performance Computing (SAAHPC), 2011 Symposium on

Conference_Location

Knoxville, TN

Print_ISBN

978-1-4577-0635-6

Electronic_ISBN

978-0-7695-4448-9

Type

conf

DOI

10.1109/SAAHPC.2011.20

Filename

6031572