DocumentCode :
626670
Title :
DRAM access reduction in GPUs by thread-block scheduling for overlapped data reuse
Author :
Seungyeol Lee ; Wonyong Sung
Author_Institution :
Dept. of Electr. Eng., Seoul Nat. Univ., Seoul, South Korea
fYear :
2013
fDate :
19-23 May 2013
Firstpage :
901
Lastpage :
904
Abstract :
General-purpose graphics processing units (GPGPUs) show very high throughput when executing parallel programs. However, they usually demand very large DRAM bandwidth and consume much power for memory access. Although recent high-performance GPGPUs are equipped with an L2 cache to absorb some of the DRAM accesses, the cache hit ratio can hardly be very high because of the limited cache size. We propose a GPU thread-block scheduling method that makes better use of the L2 cache and reduces DRAM accesses. The method exploits inter-block locality when scheduling GPU thread-blocks, and it can be implemented easily by modifying application programs. Applied to the Hotspot benchmark programs, the technique reduces DRAM accesses by up to 39%.
Keywords :
DRAM chips; cache storage; graphics processing units; scheduling; DRAM access reduction; DRAM bandwidth; DRAM memory access; GPU; Hotspot benchmark programs; L2 cache; application programs; cache hit ratio; cache size; general purpose graphics processing units; inter-block locality; overlapped data reuse; parallel programs; thread-block scheduling; Cache memory; Computer architecture; Graphics processing units; Instruction sets; Message systems; Random access memory; Strips;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Circuits and Systems (ISCAS), 2013 IEEE International Symposium on
Conference_Location :
Beijing
ISSN :
0271-4302
Print_ISBN :
978-1-4673-5760-9
Type :
conf
DOI :
10.1109/ISCAS.2013.6571993
Filename :
6571993