DocumentCode :
1786779
Title :
Multi-layer memory resiliency
Author :
Dutt, Nikil ; Gupta, Puneet ; Nicolau, A. ; BanaiyanMofrad, Abbas ; Gottscho, M. ; Shoushtari, Majid
Author_Institution :
Dept. of Comput. Sci., Univ. of California, Irvine, Irvine, CA, USA
fYear :
2014
fDate :
1-5 June 2014
Firstpage :
1
Lastpage :
6
Abstract :
With memories continuing to dominate the area, power, cost and performance of a design, there is a critical need to provision reliable, high-performance memory bandwidth for emerging applications. Memories are susceptible to degradation and failures from a wide range of manufacturing, operational and environmental effects, requiring a multi-layer hardware/software approach that can tolerate, adapt and even opportunistically exploit such effects. The overall memory hierarchy is also highly vulnerable to the adverse effects of variability and operational stress. After reviewing the major memory degradation and failure modes, this paper describes the challenges for dependability across the memory hierarchy, and outlines research efforts to achieve multi-layer memory resilience using a hardware/software approach. Two specific exemplars are used to illustrate multi-layer memory resilience: first we describe static and dynamic policies to achieve energy savings in caches using aggressive voltage scaling combined with disabling faulty blocks; and second we show how software characteristics can be exposed to the architecture in order to mitigate the aging of large register files in GPGPUs. These approaches can further benefit from semantic retention of application intent to enhance memory dependability across multiple abstraction levels, including applications, compilers, run-time systems, and hardware platforms.
Keywords :
cache storage; energy conservation; graphics processing units; hardware-software codesign; power aware computing; GPGPU; abstraction levels; cache; dynamic policy; energy savings; failure modes; faulty blocks; high-performance memory bandwidth; memory degradation; memory dependability; memory hierarchy; multilayer hardware-software approach; multilayer memory resiliency; operational stress; software characteristics; static policy; variability stress; voltage scaling; Circuit faults; Hardware; Memory management; Random access memory; Reliability; Resilience; Software;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Design Automation Conference (DAC), 2014 51st ACM/EDAC/IEEE
Conference_Location :
San Francisco, CA
Type :
conf
DOI :
10.1145/2593069.2596684
Filename :
6881375