• DocumentCode
    2041714
  • Title
    GPU-ABiSort: optimal parallel sorting on stream architectures
  • Author
    Greb, A.; Zachmann, Gabriel
  • Author_Institution
    Inst. of Comput. Sci. II, Rhein. Friedr.-Wilh.-Univ. Bonn, Germany
  • fYear
    2006
  • fDate
    25-29 April 2006
  • Abstract
    In this paper, we present a novel approach for parallel sorting on stream processing architectures. It is based on adaptive bitonic sorting. For sorting n values utilizing p stream processor units, this approach achieves the optimal time complexity O((n log n)/p). This makes our approach competitive with common sequential sorting algorithms from a theoretical viewpoint, and it is also very fast in practice. This is achieved by using efficient linear stream memory accesses and by combining the optimal time approach with algorithms optimized for small input sequences. We present an implementation on modern programmable graphics hardware (GPUs). On GPUs, our optimal parallel sorting approach has been shown to be remarkably faster than sequential sorting on the CPU, and it is also faster than previous non-optimal sorting approaches on the GPU for sufficiently large input sequences. Because our algorithm scales excellently with the number of stream processor units p (up to n/log² n or even n/log n units, depending on the stream architecture), our approach profits heavily from the trend of an increasing number of fragment processor units on GPUs, so that we can expect further speed improvements with upcoming GPU generations.
  • Keywords
    computational complexity; computer graphic equipment; coprocessors; parallel processing; sorting; adaptive bitonic sorting; linear stream memory accesses; optimal parallel sorting; programmable graphics hardware; stream processing architectures; time complexity; Computer architecture; Computer science; Graphics; Hardware; Parallel algorithms; Parallel architectures; Parallel programming; Programming profession; Scalability; Sorting;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Parallel and Distributed Processing Symposium, 2006. IPDPS 2006. 20th International
  • Print_ISBN
    1-4244-0054-6
  • Type
    conf
  • DOI
    10.1109/IPDPS.2006.1639284
  • Filename
    1639284