Regional Information Center for Science and Technology - VWS: A versatile warp scheduler for exploring diverse cache localities of GPGPU applications

DocumentCode :

726357

Title :

VWS: A versatile warp scheduler for exploring diverse cache localities of GPGPU applications

Author :

Mengjie Mao ; Jingtong Hu ; Yiran Chen ; Hai Li

Author_Institution :

Dept. of Electr. & Comput. Eng., Univ. of Pittsburgh, Pittsburgh, PA, USA

fYear :

2015

fDate :

8-12 June 2015

Firstpage :

Lastpage :

Abstract :

The massive multi-threading of GPGPUs demands efficient use of caches with limited capacity. In this work, we propose a versatile warp scheduler (VWS) to reduce the cache miss rate in GPGPUs. VWS retains intra-warp cache locality using an efficient per-warp working set estimator, and enhances intra-/inter-cooperative thread array (CTA) cache locality by imposing a CTA-aware scheduling policy and a new CTA dispatching mechanism. The significantly improved hit rate of the cache hierarchy enables VWS to achieve, on average, 38.4% and 9.3% IPC improvement across diverse GPGPU applications compared to a widely used warp scheduler and a state-of-the-art warp scheduler, respectively.

Keywords :

cache storage; graphics processing units; multi-threading; processor scheduling; CTA dispatching mechanism; CTA-aware scheduling policy; GPGPU application; IPC; VWS; cache hierarchy; cache locality; cache miss rate; intra-inter (CTA); intra-inter-cooperative thread array; multithreading; versatile warp scheduler; Dispatching; Hardware; Instruction sets; Kernel; Mathematical model; Radiation detectors; Scheduling;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Design Automation Conference (DAC), 2015 52nd ACM/EDAC/IEEE

Conference_Location :

San Francisco, CA

Type :

conf

DOI :

10.1145/2744769.2744931

Filename :

7167267

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=726357