Title :
Roll-forward and rollback recovery: performance-reliability trade-off
Author :
Pradhan, D.K. ; Vaidya, N.H.
Author_Institution :
Dept. of Comput. Sci., Texas A&M Univ., College Station, TX, USA
Abstract :
Performance and reliability achieved by a modular redundant system depend on the recovery scheme used. Typically, gain in performance using comparable resources results in reduced reliability. Several high performance computers are noted for small mean time to failure. Performance is measured here in terms of mean and variance of the task completion time, reliability being a task-based measure defined as the probability that a task is completed correctly. Two roll-forward schemes are compared with two rollback schemes for achieving recovery in duplex systems. The roll-forward schemes discussed here are based on a roll-forward checkpointing concept. Roll-forward recovery schemes achieve significantly better performance than rollback schemes by avoiding rollback in most common fault scenarios. It is shown that the roll-forward schemes improve performance with only a small loss in reliability as compared to rollback schemes.<>
Keywords :
fault tolerant computing; performance evaluation; reliability; checkpointing concept; duplex systems; high performance computers; modular redundant system; performance reliability; roll-forward recovery; rollback recovery; task-based measure; Checkpointing; Computer science; High performance computing; Performance gain; Reliability; Time measurement;
Conference_Titel :
Fault-Tolerant Computing, 1994. FTCS-24. Digest of Papers., Twenty-Fourth International Symposium on
Conference_Location :
Austin, TX, USA
Print_ISBN :
0-8186-5520-8
DOI :
10.1109/FTCS.1994.315642