DocumentCode
3333144
Title
Design and evaluation of fault tolerance techniques for highly parallel architectures
Author
Abraham, Jacob A.
Author_Institution
Comput. Eng. Res. Center, Texas Univ., Austin, TX, USA
fYear
1991
fDate
1-2 Mar 1991
Firstpage
12
Abstract
Summary form only given. The author discusses fault tolerance techniques for computer systems, including a new technique, which he calls algorithm-based fault tolerance, for error detection and correction when computations are performed on multiple-processor systems. The technique uses knowledge about the algorithm to reduce the overhead required for fault tolerance. This is done by appropriately encoding the data and tailoring the algorithms to operate on the encoded data and produce encoded output data. Examples are given of applications including matrix operations, fast Fourier transforms, and computation of eigenvalues.
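The abstract does not say which encoding is used, so the following is only a minimal sketch of the encode, compute, and check idea for matrix multiplication, assuming a simple checksum encoding. A checksum row is appended to one operand and a checksum column to the other, so the product carries checksums that can locate and repair a single corrupted element. All function names are hypothetical, and integer data is assumed so that checksums can be compared exactly; floating-point data would need a tolerance.

```python
def column_checksum(a):
    """Encode a: append a row holding the sum of each column."""
    return [row[:] for row in a] + [[sum(col) for col in zip(*a)]]


def row_checksum(b):
    """Encode b: append to each row the sum of its elements."""
    return [row[:] + [sum(row)] for row in b]


def matmul(a, b):
    """Plain matrix product; it runs unchanged on the encoded operands."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def locate_error(c_full):
    """Return (row, col) of a single corrupted data element, or None."""
    n = len(c_full) - 1       # number of data rows
    m = len(c_full[0]) - 1    # number of data columns
    bad_rows = [i for i in range(n)
                if sum(c_full[i][:m]) != c_full[i][m]]
    bad_cols = [j for j in range(m)
                if sum(c_full[i][j] for i in range(n)) != c_full[n][j]]
    if not bad_rows and not bad_cols:
        return None
    if len(bad_rows) == 1 and len(bad_cols) == 1:
        return bad_rows[0], bad_cols[0]
    raise ValueError("error pattern is not a single data-element fault")


def correct_error(c_full, pos):
    """Rebuild the element at pos from its row checksum (in place)."""
    i, j = pos
    m = len(c_full[0]) - 1
    others = sum(c_full[i][:m]) - c_full[i][j]
    c_full[i][j] = c_full[i][m] - others


a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]
c_full = matmul(column_checksum(a), row_checksum(b))  # encoded output

c_full[1][0] = 99                 # simulate a fault in one processor
pos = locate_error(c_full)        # inconsistent row and column intersect
correct_error(c_full, pos)
```

In this sketch the extra work is one added row, one added column, and the checksum comparisons, which is small next to duplicating the whole computation; that is the kind of overhead reduction the abstract refers to.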
Keywords
error correction; error detection; fault tolerant computing; parallel architectures; FFT; algorithm-based fault tolerance; computer systems; eigenvalues; encoded data; fast Fourier transforms; highly parallel architectures; matrix operations; multiple processor systems; Application software; Circuit faults; Concurrent computing; Encoding; Fabrication; Fault tolerance; Fault tolerant systems; Integrated circuit technology; Jacobian matrices; Parallel architectures
fLanguage
English
Publisher
IEEE
Conference_Titel
VLSI, 1991. Proceedings., First Great Lakes Symposium on
Conference_Location
Kalamazoo, MI
Print_ISBN
0-8186-2170-2
Type
conf
DOI
10.1109/GLSV.1991.143934
Filename
143934