DocumentCode :
55715
Title :
Improving GPU Memory Performance with Artificial Barrier Synchronization
Author :
Shih-Hsiang Lo ; Che-Rung Lee ; Quey-Liang Kao ; I-Hsin Chung ; Yeh-Ching Chung
Author_Institution :
Dept. of Comput. Sci., Nat. Tsing Hua Univ., Hsinchu, Taiwan
Volume :
25
Issue :
9
fYear :
2014
fDate :
Sept. 2014
Firstpage :
2342
Lastpage :
2352
Abstract :
Barrier synchronization, an essential mechanism for a block of threads to guard data consistency, is regarded as a threat to performance. This study, however, provides a different viewpoint for barrier synchronization on GPUs: adding barrier synchronization, even when functionally unnecessary, can improve the performance of some memory-intensive applications. We explain this phenomenon using a memory contention model in which artificial barrier synchronization helps reduce memory contention and preserve data access locality. To yield practical applications, we identify a program pattern: artificial barrier synchronization can be used to synchronize the memory accesses when the data locality among threads is violated. Empirical results from three real-world applications demonstrate that artificial barrier synchronization can increase performance by 10 to 20 percent.
Keywords :
cache storage; data integrity; graphics processing units; performance evaluation; synchronisation; GPU memory performance; artificial barrier synchronization; data access locality; data consistency; memory contention model; memory-intensive applications; program pattern; Graphics processing units; Instruction sets; Kernel; Message systems; Performance evaluation; Random access memory; Synchronization; Graphics processors; parallel languages; resource contention; synchronization
fLanguage :
English
Journal_Title :
Parallel and Distributed Systems, IEEE Transactions on
Publisher :
IEEE
ISSN :
1045-9219
Type :
jour
DOI :
10.1109/TPDS.2013.133
Filename :
6515115