DocumentCode
3180781
Title
A proposed approach to recovery from transient faults in wavefront processor arrays
Author
Murthy, Vinay ; Gray, F. Gail ; Davis, Nathaniel J., IV
Author_Institution
Bradley Dept. of Electr. Eng., Virginia Polytech. Inst. & State Univ., Blacksburg, VA, USA
fYear
1993
fDate
4-7 Apr 1993
Firstpage
0.708333333333333
Abstract
The need for a transient fault recovery procedure in a wavefront processor array is examined. The idea of data rollback to recover from transient faults is considered. Data rollback is achieved by creating checkpoints at different instants of time; when an error occurs, backtracking is done to a consistent state and computation resumes from that state. An algorithm for data rollback in wavefront processor arrays is developed. After the fault is detected, the error information is propagated throughout the array. A distributed recovery algorithm is employed, so that there is no single point of failure in the fault recovery mechanism. An important criterion in the data-rollback algorithm is that the processing elements are designed to be totally self-checking; this ensures that there is no rollback propagation.
Keywords
parallel processing; system recovery; transients; backtracking; data rollback; distributed recovery algorithm; error information; processing elements; transient fault recovery; wavefront processor arrays; Circuit faults; Fault detection; Fault tolerance; Image processing; Pipelines; Process design; Resumes; Signal processing; Throughput; Very large scale integration;
fLanguage
English
Publisher
IEEE
Conference_Titel
Southeastcon '93, Proceedings., IEEE
Conference_Location
Charlotte, NC
Print_ISBN
0-7803-1257-0
Type
conf
DOI
10.1109/SECON.1993.465706
Filename
465706