DocumentCode
3543584
Title
SIMT Microscheduling: Reducing Thread Stalling in Divergent Iterative Algorithms
Author
Frey, Steffen ; Reina, Guido ; Ertl, Thomas
Author_Institution
Visualization Res. Center, Univ. of Stuttgart (VISUS), Stuttgart, Germany
fYear
2012
fDate
15-17 Feb. 2012
Firstpage
399
Lastpage
406
Abstract
The global scheduler of a current GPU distributes thread blocks to streaming multiprocessors (SMs), which schedule threads for execution at the granularity of a warp. Threads in a warp execute the same code path in lockstep, which can waste a large number of cycles when control flow diverges. To overcome this general issue of SIMT architectures, we propose techniques that relax divergence on the fly within a computation kernel to achieve a much higher total utilization of processing cores. The techniques address branch and loop divergence (and may be combined) by switching to suitable tasks during a GPU kernel run every time divergence occurs. They can easily be applied to arbitrary iterative algorithms, and we evaluate the performance and effectiveness of our approach using synthetic and real-world applications.
Keywords
graphics processing units; iterative methods; mathematics computing; multiprocessing systems; scheduling; GPU global scheduler; SIMT microscheduling; branch-and-loop divergence technique; computation kernel; divergent control flow; divergent iterative algorithm; graphics processing unit; processing core utilization; symmetric multiprocessor; thread block scheduling; thread stalling reduction; warp granularity; Context; Graphics processing unit; Hardware; Instruction sets; Kernel; Memory management; Switches; Divergence; GPU; Scheduling;
fLanguage
English
Publisher
IEEE
Conference_Titel
Parallel, Distributed and Network-Based Processing (PDP), 2012 20th Euromicro International Conference on
Conference_Location
Garching
ISSN
1066-6192
Print_ISBN
978-1-4673-0226-5
Type
conf
DOI
10.1109/PDP.2012.62
Filename
6169578