DocumentCode :
704751
Title :
DRAW: investigating benefits of adaptive fetch group size on GPU
Author :
Myung Kuk Yoon ; Yunho Oh ; Sangpil Lee ; Seung Hun Kim ; Deokho Kim ; Won Woo Ro
Author_Institution :
Yonsei Univ., Seoul, South Korea
fYear :
2015
fDate :
29-31 March 2015
Firstpage :
183
Lastpage :
192
Abstract :
Hiding operation stalls is one of the important issues in suppressing performance degradation of Graphics Processing Units (GPUs). In this paper, we first conduct a detailed study of the factors affecting operation stalls in terms of the fetch group size of the warp scheduler. We find that the fetch group size strongly affects how well various types of operation stalls are hidden. Short-latency stalls can be hidden by issuing other available warps from the same fetch group. Therefore, short-latency stalls may not be hidden well with a small fetch group, since the group has only a limited number of issuable warps to hide them. In contrast, long-latency stalls can be hidden by dividing warps into multiple fetch groups: the scheduler switches fetch groups when the warps in the current fetch group reach a long-latency memory operation. Therefore, long-latency stalls may not be hidden well with a large fetch group, because increasing the fetch group size reduces the number of fetch groups available to hide them. In addition, load/store unit stalls are caused by the limited hardware resources available to handle memory operations. To hide all of these stalls effectively, we propose the Dynamic Resizing on Active Warps (DRAW) scheduler, which adjusts the size of the active fetch group. In our evaluation, the DRAW scheduler reduces stall cycles by 16.3% on average and improves performance by 11.3% on average compared to the conventional two-level warp scheduler.
Keywords :
graphics processing units; resource allocation; scheduling; DRAW scheduler; GPU; adaptive fetch group size; dynamic resizing on active warps; graphics processing unit; operation stall; warp scheduler; Clocks; Graphics processing units; Hardware; Hazards; Instruction sets; Registers; Round robin;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Performance Analysis of Systems and Software (ISPASS), 2015 IEEE International Symposium on
Conference_Location :
Philadelphia, PA
Type :
conf
DOI :
10.1109/ISPASS.2015.7095804
Filename :
7095804