DocumentCode :
1787634
Title :
Using multi-level cell STT-RAM for fast and energy-efficient local checkpointing
Author :
Ping Chi ; Cong Xu ; Tao Zhang ; Xiangyu Dong ; Yuan Xie
Author_Institution :
Dept. of Comput. Sci. & Eng., Pennsylvania State Univ., University Park, PA, USA
fYear :
2014
fDate :
2-6 Nov. 2014
Firstpage :
301
Lastpage :
308
Abstract :
High reliability, availability, and serviceability are critical for modern large-scale computing systems. As an effective error recovery mechanism, checkpointing has been widely used in such systems for their survival from unexpected failures. The conventional checkpointing schemes, however, are time-consuming due to the limited I/O bandwidth between the DRAM-based main memory and the backup storage. To mitigate the checkpoint overhead, we propose a fast local checkpointing scheme by leveraging Multi-Level Cell (MLC) STT-RAM. We take advantage of the unique features of MLC STT-RAM to accelerate local checkpointing. Our experimental results show that the average performance overhead is less than 1% in a multi-programmed four-core process node with a 1-second local checkpoint interval. The evaluation results also demonstrate that using MLC STT-RAM is an energy-efficient solution.
Keywords :
checkpointing; microcomputers; random-access storage; DRAM-based main memory; MLC STT-RAM; backup storage; checkpointing schemes; error recovery mechanism; large-scale computing systems; multilevel cell STT-RAM; spin transfer torque random access memory; Checkpointing; Magnetic tunneling; Nonvolatile memory; Phase change random access memory; Resistance; Switches;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Computer-Aided Design (ICCAD), 2014 IEEE/ACM International Conference on
Conference_Location :
San Jose, CA
Type :
conf
DOI :
10.1109/ICCAD.2014.7001367
Filename :
7001367