DocumentCode :
803082
Title :
Certification of computational results
Author :
Sullivan, Gregory F. ; Wilson, Dwight S. ; Masson, Gerald M.
Author_Institution :
Dept. of Comput. Sci., Johns Hopkins Univ., Baltimore, MD, USA
Volume :
44
Issue :
7
fYear :
1995
fDate :
7/1/1995 12:00:00 AM
Firstpage :
833
Lastpage :
847
Abstract :
We describe a conceptually novel and powerful technique to achieve fault detection and fault tolerance in hardware and software systems. When used for software fault detection, this new technique uses time and software redundancy and can be outlined as follows. In the initial phase, a program is run to solve a problem and store the result. In addition, this program leaves behind a trail of data which we call a certification trail. In the second phase, another program is run which solves the original problem again. This program, however, has access to the certification trail left by the first program. Because of the availability of the certification trail, the second phase can be performed by a less complex program and can execute more quickly. In the final phase, the two results are compared and, if they agree, the results are accepted as correct; otherwise, an error is indicated. An essential aspect of this approach is that the second program must always generate either an error indication or a correct output, even when the certification trail it receives from the first program is incorrect. We formalize the certification trail approach to fault tolerance and illustrate realizations of it by considering algorithms for the following problems: convex hull, sorting, and shortest path. We compare the certification trail approach to other approaches to fault tolerance.
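The three phases outlined in the abstract can be illustrated for the sorting problem. The following is a minimal sketch under stated assumptions, not the authors' algorithm: it assumes the certification trail is the permutation of input indices produced by the first sort, and all names (`first_phase`, `second_phase`, `certified_sort`, `CertificationError`) are hypothetical.

```python
from typing import List, Sequence, Tuple


class CertificationError(Exception):
    """Raised when the trail or the compared results fail a check."""


def first_phase(data: Sequence[int]) -> Tuple[List[int], List[int]]:
    # Solve the problem and leave a trail behind.
    # Assumed trail: the order in which input indices appear in the output.
    trail = sorted(range(len(data)), key=lambda i: data[i])
    result = [data[i] for i in trail]
    return result, trail


def second_phase(data: Sequence[int], trail: Sequence[int]) -> List[int]:
    # Solve the problem again in linear time, using the trail.
    # The trail is never trusted: it must be a permutation of the indices,
    # and the output it produces must be in nondecreasing order.
    n = len(data)
    if len(trail) != n:
        raise CertificationError("trail has the wrong length")
    seen = [False] * n
    result = []
    for idx in trail:
        if not isinstance(idx, int) or idx < 0 or idx >= n or seen[idx]:
            raise CertificationError("trail is not a permutation")
        seen[idx] = True
        result.append(data[idx])
    for a, b in zip(result, result[1:]):
        if a > b:
            raise CertificationError("trail does not yield sorted output")
    return result


def certified_sort(data: Sequence[int]) -> List[int]:
    # Final phase: accept the result only if both phases agree.
    result, trail = first_phase(data)
    checked = second_phase(data, trail)
    if result != checked:
        raise CertificationError("results of the two phases differ")
    return result
```

In this sketch, `second_phase` returns a correct sorted output or raises `CertificationError` for any trail it is given, which mirrors the requirement that the second program must never silently produce a wrong answer from a faulty trail.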
Keywords :
certification; computational geometry; optimisation; redundancy; software fault tolerance; software standards; sorting; certification trail; computational results certification; convex hull; correct output; error indication; fault tolerance; shortest path; software fault detection; software redundancy; software systems; sorting; time redundancy; Availability; Certification; Error correction; Fault detection; Fault tolerance; Fault tolerant systems; Hardware; Redundancy; Software systems; Sorting;
fLanguage :
English
Journal_Title :
Computers, IEEE Transactions on
Publisher :
IEEE
ISSN :
0018-9340
Type :
jour
DOI :
10.1109/12.392843
Filename :
392843