Title :
Bamboo ECC: Strong, safe, and flexible codes for reliable computer memory
Author :
Kim, Jungrae ; Sullivan, Michael ; Erez, Mattan
Author_Institution :
Department of Electrical and Computer Engineering The University of Texas at Austin
Abstract :
Growing computer system sizes and levels of integration have made memory reliability a primary concern, necessitating strong memory error protection. As such, large-scale systems typically employ error checking and correcting codes to trade redundant storage and bandwidth for increased reliability. While stronger memory protection will be needed to meet reliability targets in the future, it is undesirable to further increase the amount of storage and bandwidth spent on redundancy. We propose a novel family of single-tier ECC mechanisms called Bamboo ECC to simultaneously address the conflicting requirements of increasing reliability while maintaining or decreasing error protection overheads. Relative to the state-of-the-art single-tier error protection, Bamboo ECC codes have superior correction capabilities, all but eliminate the risk of silent data corruption, and can also increase redundancy at a fine granularity, enabling more adaptive graceful downgrade schemes. These strength, safety, and flexibility advantages translate to a significantly more reliable memory system. To demonstrate this, we evaluate a family of Bamboo ECC organizations in the context of conventional 72b and 144b DRAM channels and show the significant error coverage and memory lifespan improvements of Bamboo ECC relative to existing SEC-DED, chipkill-correct and double-chipkill-correct schemes.
Keywords :
DRAM chips; Decoding; Error correction; Error correction codes; Redundancy;
Conference_Titel :
High Performance Computer Architecture (HPCA), 2015 IEEE 21st International Symposium on
Conference_Location :
Burlingame, CA, USA
DOI :
10.1109/HPCA.2015.7056025