• DocumentCode
    505974
  • Title

    Efficient gather and scatter operations on graphics processors

  • Author

    He, Bingsheng ; Govindaraju, Naga K. ; Luo, Qiong ; Smith, Burton

  • Author_Institution
    Hong Kong Univ. of Science and Technology
  • fYear
    2007
  • fDate
    10-16 Nov. 2007
  • Firstpage
    1
  • Lastpage
    12
  • Abstract
    Gather and scatter are two fundamental data-parallel operations, where a large number of data items are read (gathered) from or are written (scattered) to given locations. In this paper, we study these two operations on graphics processing units (GPUs). With superior computing power and high memory bandwidth, GPUs have become a commodity multiprocessor platform for general-purpose high-performance computing. However, due to the random access nature of gather and scatter, a naive implementation of the two operations suffers from a low utilization of the memory bandwidth and consequently a long, unhidden memory latency. Additionally, the architectural details of the GPUs, in particular, the memory hierarchy design, are unclear to the programmers. Therefore, we design multi-pass gather and scatter operations to improve their data access locality, and develop a performance model to help understand and optimize these two operations. We have evaluated our algorithms in sorting, hashing, and the sparse matrix-vector multiplication in comparison with their optimized CPU counterparts. Our results show that these optimizations yield 2--4X improvement on the GPU bandwidth utilization and 30--50% improvement on the response time. Overall, our optimized GPU implementations are 2--7X faster than their optimized CPU counterparts.
  • Keywords
    Aggregates; Bandwidth; Computer architecture; Computer science; Data engineering; Delay; Graphics; Large-scale systems; Robustness; Scattering; cache optimization; gather; graphics processors; parallel processing; scatter;
  • fLanguage
    English
  • Publisher
    ieee
  • Conference_Titel
    Supercomputing, 2007. SC '07. Proceedings of the 2007 ACM/IEEE Conference on
  • Conference_Location
    Reno, NV, USA
  • Print_ISBN
    978-1-59593-764-3
  • Electronic_ISBN
    978-1-59593-764-3
  • Type

    conf

  • DOI
    10.1145/1362622.1362684
  • Filename
    5348805