DocumentCode :
3706518
Title :
Assessing the Impact of Partial Verifications against Silent Data Corruptions
Author :
Aurélien ;Saurabh K. Raina;Yves Robert;Hongyang Sun
Author_Institution :
INRIA, Ecole Normale Super. de Lyon, Lyon, France
fYear :
2015
Firstpage :
440
Lastpage :
449
Abstract :
Silent errors, or silent data corruptions, constitute a major threat on very large-scale platforms. When a silent error strikes, it is not detected immediately but only after some delay, which prevents the use of pure periodic checkpointing approaches devised for fail-stop errors. Instead, checkpointing must be coupled with some verification mechanism to guarantee that corrupted data will never be written into the checkpoint file. Such a guaranteed verification mechanism typically incurs a high cost. In this paper, we assess the impact of using partial verification mechanisms in addition to a guaranteed verification. The main objective is to investigate to what extent it is worthwhile to use some low-cost but less accurate verifications in the middle of a periodic computing pattern, which ends with a guaranteed verification right before each checkpoint. Introducing partial verifications dramatically complicates the analysis, but we are able to analytically determine the optimal computing pattern (up to a first-order approximation), including the optimal length of the pattern, the optimal number of partial verifications, and their optimal positions inside the pattern. Performance evaluations based on a wide range of parameters confirm the benefit of using partial verifications under certain scenarios, when compared to the baseline algorithm that uses only guaranteed verifications.
Keywords :
"Checkpointing","Protocols","Approximation methods","Redundancy","Analytical models","Performance evaluation","Resilience"
Publisher :
IEEE
Conference_Title :
Parallel Processing (ICPP), 2015 44th International Conference on
ISSN :
0190-3918
Type :
conf
DOI :
10.1109/ICPP.2015.53
Filename :
7349599