DocumentCode
2515288
Title
GRS — GPU radix sort for multifield records
Author
Bandyopadhyay, Shibdas ; Sahni, Sartaj
Author_Institution
Dept. of Comput. & Inf. Sci. & Eng., Univ. of Florida, Gainesville, FL, USA
fYear
2010
fDate
19-22 Dec. 2010
Firstpage
1
Lastpage
10
Abstract
We develop a radix sort algorithm, GRS, suitable to sort multifield records on a graphics processing unit (GPU). We assume the ByField layout for records to be sorted. GRS is benchmarked against the radix sort algorithm, SDK, in NVIDIA's CUDA SDK 3.0 as well as the radix sort algorithm, SRTS, of Merrill and Grimshaw. Although SRTS is faster than both GRS and SDK when sorting numbers as well as records that have a key and an additional 32-bit field, both GRS and SDK outperform SRTS on records with 2 or more fields (in addition to the key). GRS is consistently faster than SDK on numbers as well as records with 1 or more fields. When sorting records with 9 32-bit fields, GRS is up to 74% faster than SRTS and up to 55% faster than SDK. Thus, GRS is the fastest way to radix sort records with more than 1 32-bit field on a GPU.
Keywords
computer graphic equipment; coprocessors; parallel architectures; records management; sorting; ByField layout; GRS-GPU radix sort algorithm; NVIDIA CUDA SDK 3.0; SRTS; compute unified device architecture; graphics processing unit; multifield record; Graphics processing unit; Histograms; Instruction sets; Kernel; Layout; Registers; Tiles; Graphics Processing Units; radix sort; sorting multifield records
fLanguage
English
Publisher
IEEE
Conference_Titel
High Performance Computing (HiPC), 2010 International Conference on
Conference_Location
Dona Paula
Print_ISBN
978-1-4244-8518-5
Electronic_ISBN
978-1-4244-8519-2
Type
conf
DOI
10.1109/HIPC.2010.5713164
Filename
5713164