• DocumentCode
    55715
  • Title

    Improving GPU Memory Performance with Artificial Barrier Synchronization

  • Author

    Shih-Hsiang Lo ; Che-Rung Lee ; Quey-Liang Kao ; I-Hsin Chung ; Yeh-Ching Chung

  • Author_Institution
    Dept. of Comput. Sci., Nat. Tsing Hua Univ., Hsinchu, Taiwan
  • Volume
    25
  • Issue
    9
  • fYear
    2014
  • fDate
    Sept. 2014
  • Firstpage
    2342
  • Lastpage
    2352
  • Abstract
    Barrier synchronization, an essential mechanism for a block of threads to guard data consistency, is regarded as a threat to performance. This study, however, provides a different viewpoint for barrier synchronization on GPUs: adding barrier synchronization, even when functionally unnecessary, can improve the performance of some memory-intensive applications. We explain this phenomenon using a memory contention model in which artificial barrier synchronization helps reduce memory contention and preserve data access locality. To yield practical applications, we identify a program pattern: artificial barrier synchronization can be used to synchronize the memory accesses when the data locality among threads is violated. Empirical results from three real-world applications demonstrate that artificial barrier synchronization can increase performance by 10 to 20 percent.
  • Keywords
    cache storage; data integrity; graphics processing units; performance evaluation; synchronisation; GPU memory performance; artificial barrier synchronization; data access locality; data consistency; memory contention model; memory-intensive applications; program pattern; Graphics processing units; Instruction sets; Kernel; Message systems; Performance evaluation; Random access memory; Synchronization; Graphics processors; parallel languages; resource contention; synchronization
  • fLanguage
    English
  • Journal_Title
    Parallel and Distributed Systems, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1045-9219
  • Type

    jour

  • DOI
    10.1109/TPDS.2013.133
  • Filename
    6515115