• DocumentCode
    1994172
  • Title
    An On-chip Heterogeneous Implementation of a General Sparse Linear Solver
  • Author
    Sadrieh, Arash ; Charissis, Stefano ; Hill, A.P.
  • Author_Institution
    Victor Chang Cardiac Res. Inst., Sydney, NSW, Australia
  • fYear
    2013
  • fDate
    20-24 May 2013
  • Firstpage
    54
  • Lastpage
    63
  • Abstract
    Inter-device communication is a common limitation of GPGPU computing methods. The on-chip heterogeneous architecture of a recent class of accelerated processing units (APUs), which combine programmable CPU and GPU cores on the same die, presents an opportunity to address this problem. Here we describe an APU-based heterogeneous implementation of the Jacobi-preconditioned conjugate gradient method and identify a set of optimal configurations based on examination of standard matrices. By leveraging the low-latency memory transactions of the APU and exploiting CPU/GPU cohabitation for concurrent vector operations, we achieve performance comparable to that of a high-end GPU running CUSP. Our results show that on-chip heterogeneous architectures can be attractively cost-effective and can even perform better for applications with a low number of linear solver iterations and when device-to-device data transfer is significant. Accordingly, the APU architecture and associated GPAPU methods have significant potential as a low-cost, energy-efficient alternative for parallel HPC architectures.
  • Keywords
    Jacobian matrices; concurrency theory; conjugate gradient methods; graphics processing units; mathematics computing; memory architecture; parallel architectures; sparse matrices; APU architecture; CPU; CUSP; GPAPU method; GPGPU computing method; Jacobi-preconditioned conjugate gradient method; accelerated processing unit; concurrent vector operation; device-to-device data transfer; general sparse linear solver iteration; interdevice communication; latency memory transaction; on-chip heterogeneous architecture; parallel HPC architecture; standard matrix; Computer architecture; Data transfer; Graphics processing units; Instruction sets; Random access memory; Sparse matrices; Vectors; APU; GPAPU; HPC; Heterogeneous; Jacobi Conjugate Gradient; Sparse Linear Solver;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Parallel and Distributed Processing Symposium Workshops & PhD Forum (IPDPSW), 2013 IEEE 27th International
  • Conference_Location
    Cambridge, MA
  • Print_ISBN
    978-0-7695-4979-8
  • Type
    conf
  • DOI
    10.1109/IPDPSW.2013.51
  • Filename
    6650871