Title :
Rethinking Prefetching in GPGPUs: Exploiting Unique Opportunities
Author :
Ahmad Lashgar;Amirali Baniasadi
Author_Institution :
Electr. &
Abstract :
In this paper we investigate static memory access predictability in GPGPU workloads, at the thread block granularity. We first show that a significant share of accessed memory addresses can be predicted using thread block identifiers. We build on this observation and introduce a hardware-software prefetching scheme to reduce average memory access time. Our proposed scheme issues the memory requests of thread block before it starts execution. The scheme relies on static analyzer to parse the kernel and find predictable memory accesses. Runtime API calls pass this information to the hardware. Hardware dynamically prefetches the data of each thread block based on this information. In our scheme, prefetch accuracy is controlled by software (static analyzer and API calls) and hardware controls the prefetching timeliness. We introduce few machine models to explore the design space and performance potential behind the scheme. Our evaluation shows that the scheme can achieve a performance improvement of 59% over the baseline without prefetching.
Keywords :
"Indexes","Prefetching","Arrays","Hardware","Kernel","Graphics processing units"
Conference_Titel :
High Performance Computing and Communications (HPCC), 2015 IEEE 7th International Symposium on Cyberspace Safety and Security (CSS), 2015 IEEE 12th International Conferen on Embedded Software and Systems (ICESS), 2015 IEEE 17th International Conference on
DOI :
10.1109/HPCC-CSS-ICESS.2015.145