Title :
Impact of Error Correction Code and Dynamic Memory Reconfiguration on High-Reliability/Low-Cost Server Memory
Author :
Slayman, Charles ; Ma, Manny ; Lindley, Scott
Author_Institution :
Sun Microsystems, Inc., Santa Clara, CA
fDate :
Oct. 16 2006-Oct. 19 2006
Abstract :
History has shown that DRAM technology shrinks as server memory density grows and that, at the same time, user expectations of system uptime increase. Given this, new mitigation techniques are required to reduce the impact of DRAM faults on server reliability, availability, and serviceability (RAS). This study shows the trade-offs in the effectiveness of two commonly used error correction codes (ECC) and two dynamic memory reconfiguration (DMR) schemes against various types of anticipated memory failures. It also proposes a "RAS intelligent" way to look at device reliability as DRAM technology scales below 100 nm.
Keywords :
DRAM chips; circuit reliability; error correction codes; file servers; memory architecture; reconfigurable architectures; DRAM technology; device reliability; dynamic memory reconfiguration; error correction code; high-reliability server memory; low-cost server memory; memory failures; mitigation techniques; server availability; server reliability; server serviceability; Cosmic rays; Costs; Electronic mail; Error correction; Error correction codes; Network servers; Pins; Protection; Random access memory; Sun;
Conference_Title :
Integrated Reliability Workshop Final Report, 2006 IEEE International
Conference_Location :
South Lake Tahoe, CA
Print_ISBN :
1-4244-0296-4
Electronic_ISBN :
1930-8841
DOI :
10.1109/IRWS.2006.305243