Regional Information Center for Science and Technology (RICeST) - Accurately modeling the GPU memory subsystem

DocumentCode :

2026919

Title :

Accurately modeling the GPU memory subsystem

Author :

Candel, Francisco ; Petit, Salvador ; Sahuquillo, Julio ; Duato, Jose

Author_Institution :

Dept. of Comput. Eng., Univ. Politec. de Valencia, Valencia, Spain

fYear :

2015

fDate :

20-24 July 2015

Firstpage :

179

Lastpage :

186

Abstract :

Research on GPU processor architecture is currently very active because these architectures offer much more performance per watt than CPU architectures. This is the main reason why massive deployment of GPU multiprocessors is considered one of the most feasible ways to attain exascale computing capabilities. In this context, ongoing GPU architecture research is required to improve GPU programmability as well as to integrate CPU and GPU cores on the same die. One of the most important research topics in current GPUs is the GPU memory hierarchy, since its design goals are very different from those of conventional CPU memory hierarchies. To explore novel designs that better support general-purpose computing on GPUs (GPGPU computing), and to improve the performance of GPU and CPU/GPU systems, researchers often require advanced microarchitectural simulators with detailed models of the memory subsystem. Nevertheless, because of the fast pace at which current GPU architectures evolve, the simulation accuracy of existing state-of-the-art simulators suffers. This paper focuses on accurately modeling the GPU memory subsystem. We identified three main aspects that should be modeled with more accuracy: i) miss status holding registers, ii) coalescing vector memory requests, and iii) non-blocking GPU stores. To this end, we extend the Multi2Sim heterogeneous CPU/GPU processor simulator to model these aspects with enough accuracy. Experimental results show that if these aspects are not considered in the simulation framework, performance deviations in some applications can reach up to 70%, 75%, and 60%, respectively.
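As a rough illustration of the second aspect named in the abstract, coalescing vector memory requests, the sketch below groups the per-work-item addresses of one vector memory instruction into unique cache-line requests. It is not taken from the paper or from Multi2Sim: the function name `coalesce`, the 64-byte line size, and the 64-work-item vector width are all assumptions chosen only for the example.

```python
from collections import OrderedDict


def coalesce(addresses, line_size=64):
    """Group per-work-item byte addresses into unique cache-line requests.

    Hypothetical helper for illustration only; line_size=64 is an assumption.
    Returns an ordered mapping: line base address -> list of lane indices.
    """
    lines = OrderedDict()
    for lane, addr in enumerate(addresses):
        # Align the address down to the start of its cache line.
        base = (addr // line_size) * line_size
        lines.setdefault(base, []).append(lane)
    return lines


# 64 work-items reading consecutive 4-byte elements (assumed vector width).
unit_stride = coalesce([4 * i for i in range(64)])

# 64 work-items reading elements that are 64 bytes apart.
strided = coalesce([64 * i for i in range(64)])

print(len(unit_stride), "requests for the unit-stride access")
print(len(strided), "requests for the strided access")
```

Under these assumptions the unit-stride access collapses into a handful of line requests while the strided access needs one request per work-item, which suggests why a simulator that issues one request per work-item regardless of the access pattern could misestimate memory traffic.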

Keywords :

graphics processing units; memory architecture; multiprocessing systems; CPU memory hierarchies; GPGPU computing; GPU cores; GPU memory hierarchy; GPU memory subsystem modeling; GPU multiprocessors; GPU processor architecture; GPU programmability; Multi2Sim heterogeneous CPU-GPU processor simulator; advanced microarchitectural simulators; coalescing vector memory requests; exascale computing capabilities; general purpose computing; miss status holding registers; nonblocking GPU stores; Computational modeling; Computer architecture; Graphics processing units; Load modeling;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

High Performance Computing & Simulation (HPCS), 2015 International Conference on

Conference_Location :

Amsterdam

Print_ISBN :

978-1-4673-7812-3

Type :

conf

DOI :

10.1109/HPCSim.2015.7237038

Filename :

7237038

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=2026919