• DocumentCode
    169097
  • Title

    Efficient Scan Operator Methods on a GPU

  • Author

    Dieguez, Adrian P. ; Amor, Margarita ; Doallo, Ramon

  • Author_Institution
    Comput. Archit. Group (GAC), Univ. of A Coruna, A Coruna, Spain
  • fYear
    2014
  • fDate
    22-24 Oct. 2014
  • Firstpage
    190
  • Lastpage
    197
  • Abstract
    Current GPUs (Graphics Processing Units) offer high computational power at relatively low cost; nonetheless, this enhanced performance often comes at the expense of flexibility and with added code complexity. Efficient GPU programming requires detailed knowledge of certain hardware aspects. The scan operator is an important building block for a wide range of algorithms. In this paper, we present a number of parallel scan methods based on the traditional cyclic reduction tridiagonal solver and the Ladner-Fischer parallel prefix adder. Furthermore, we analyze a set of new features introduced in the Kepler Nvidia architecture, such as the read-only data cache and shuffle instructions. Our methods provide excellent performance in many cases, with up to a 48% improvement over the CUDA Data Parallel Primitives (CUDPP) library.
  • Keywords
    graphics processing units; parallel architectures; GPU programming; Kepler Nvidia architecture; Ladner-Fischer parallel prefix adder; cyclic reduction tridiagonal solver; graphics processing unit; parallel scan method; read-only data cache; scan operator method; shuffle instruction; Arrays; Complexity theory; Graphics processing units; Instruction sets; Kernel; Proposals; Registers;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computer Architecture and High Performance Computing (SBAC-PAD), 2014 IEEE 26th International Symposium on
  • Conference_Location
    Jussieu
  • ISSN
    1550-6533
  • Type

    conf

  • DOI
    10.1109/SBAC-PAD.2014.23
  • Filename
    6970664