DocumentCode :
177354
Title :
STAG: Spintronic-Tape Architecture for GPGPU cache hierarchies
Author :
Venkatesan, R. ; Ramasubramanian, Shankar Ganesh ; Venkataramani, Swagath ; Roy, Kaushik ; Raghunathan, Anand
Author_Institution :
Sch. of Electr. & Comput. Eng., Purdue Univ., West Lafayette, IN, USA
fYear :
2014
fDate :
14-18 June 2014
Firstpage :
253
Lastpage :
264
Abstract :
General-purpose Graphics Processing Units (GPGPUs) are widely used for executing massively parallel workloads from various application domains. Feeding data to the hundreds to thousands of cores that current GPGPUs integrate places great demands on the memory hierarchy, fueling an ever-increasing demand for on-chip memory. In this work, we propose STAG, a high-density, energy-efficient GPGPU cache hierarchy design using a new spintronic memory technology called Domain Wall Memory (DWM). DWMs inherently offer unprecedented benefits in density by storing multiple bits in the domains of a ferromagnetic nanowire, which logically resembles a bit-serial tape. However, this structure also leads to a unique challenge: the bits must be accessed sequentially by performing “shift” operations, resulting in variable and potentially higher access latencies. To address this challenge, STAG utilizes a number of architectural techniques: (i) a hybrid cache organization that employs different DWM bit-cells to realize the different memory arrays within the GPGPU cache hierarchy, (ii) a clustered, bit-interleaved organization, in which the bits in a cache block are spread across a cluster of DWM tapes, allowing parallel access, (iii) tape head management policies that predictively configure DWM arrays to reduce the expected number of shift operations for subsequent accesses, and (iv) a shift-aware promotion buffer (SaPB), in which accesses to the DWM cache are predicted based on intra-warp locality, and locations that would incur a large shift penalty are promoted to a smaller buffer. Over a wide range of benchmarks from the Rodinia, ISPASS and Parboil suites, STAG achieves significant benefits in performance (12.1% over SRAM and 5.8% over STT-MRAM) and energy (3.3X over SRAM and 2.6X over STT-MRAM).
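The shift cost the abstract describes, and the reason a tape head management policy can matter, can be illustrated with a toy model. The sketch below is not STAG's implementation. It assumes a single tape with one access port, counts one shift per domain moved, and treats any shifts that move the head back to a rest position as hidden off the critical path. The names `shifts_needed`, `total_shifts`, and `rest_position` are hypothetical and chosen only for this illustration.

```python
def shifts_needed(head, target):
    """Shifts required to align domain `target` with the access port
    when the port currently sits at domain `head` (toy model)."""
    return abs(target - head)


def total_shifts(accesses, start=0, rest_position=None):
    """Sum the on-demand shifts for a sequence of domain indices.

    rest_position=None models a lazy policy: the head stays where the
    last access left it.
    Otherwise the head is assumed to return to `rest_position` after
    every access; those return shifts are assumed to be hidden and are
    not counted here.
    """
    head = start
    total = 0
    for target in accesses:
        total += shifts_needed(head, target)
        head = target if rest_position is None else rest_position
    return total


# Alternating accesses to the two ends of an 8-domain tape.
far_apart = [0, 7, 0, 7]
# Repeated accesses to the same domain.
repeated = [5, 5, 5, 5]

lazy_far = total_shifts(far_apart)
centered_far = total_shifts(far_apart, start=3, rest_position=3)
lazy_rep = total_shifts(repeated)
centered_rep = total_shifts(repeated, start=3, rest_position=3)
```

Under these assumptions neither policy wins for every access pattern: resting near the middle helps when accesses alternate between distant domains, while leaving the head in place helps when the same domain is reused. That is consistent with the abstract's motivation for choosing the head position predictively.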
Keywords :
cache storage; ferromagnetic materials; graphics processing units; magnetoelectronics; nanowires; DWM; GPGPU cache hierarchies; STAG; cache organization; domain wall memory; ferromagnetic nanowire; general-purpose graphics processing units; memory hierarchy; on-chip memory; spintronic-tape architecture; Arrays; Magnetic domains; Magnetic tunneling; Organizations; Random access memory; Transistors; Wires;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Computer Architecture (ISCA), 2014 ACM/IEEE 41st International Symposium on
Conference_Location :
Minneapolis, MN
Print_ISBN :
978-1-4799-4396-8
Type :
conf
DOI :
10.1109/ISCA.2014.6853233
Filename :
6853233