• DocumentCode
    3333144
  • Title
    Design and evaluation of fault tolerance techniques for highly parallel architectures
  • Author
    Abraham, Jacob A.
  • Author_Institution
    Comput. Eng. Res. Center, Texas Univ., Austin, TX, USA
  • fYear
    1991
  • fDate
    1-2 Mar 1991
  • Firstpage
    12
  • Abstract
    Summary form only given. The author discusses fault tolerance techniques for computer systems, including a new technique, which he calls algorithm-based fault tolerance, for detecting and correcting errors in computations performed on multiple-processor systems. The technique uses knowledge of the algorithm to reduce the overhead required for fault tolerance: the data are encoded appropriately, and the algorithms are tailored to operate on the encoded data and produce encoded output data. Examples of applications are given, including matrix operations, fast Fourier transforms, and the computation of eigenvalues.
  • Keywords
    error correction; error detection; fault tolerant computing; parallel architectures; FFT; algorithm-based fault tolerance; computer systems; eigenvalues; encoded data; fast Fourier transforms; highly parallel architectures; matrix operations; multiple processor systems; Application software; Circuit faults; Concurrent computing; Encoding; Fabrication; Fault tolerance; Fault tolerant systems; Integrated circuit technology; Jacobian matrices
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    VLSI, 1991. Proceedings., First Great Lakes Symposium on
  • Conference_Location
    Kalamazoo, MI
  • Print_ISBN
    0-8186-2170-2
  • Type
    conf
  • DOI
    10.1109/GLSV.1991.143934
  • Filename
    143934
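
The abstract describes encoding the data so that the algorithm operates on encoded inputs and produces encoded outputs. The sketch below illustrates that idea for matrix multiplication, assuming a simple row/column checksum encoding. The abstract does not specify the encoding, so the scheme, the function names, and the restriction to exact integer arithmetic and to a single faulty data element are all assumptions made for illustration, not the author's stated method.

```python
def column_checksum(a):
    """Encode a: append one row holding the sum of each column."""
    return [row[:] for row in a] + [[sum(col) for col in zip(*a)]]


def row_checksum(b):
    """Encode b: append to each row the sum of that row."""
    return [row[:] + [sum(row)] for row in b]


def matmul(x, y):
    """Plain matrix product; applied to encoded inputs it yields encoded output."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*y)] for row in x]


def locate_error(cf):
    """Return (i, j) of a single corrupted data element, or None if consistent.

    cf is a full-checksum matrix: the last column holds row sums and the
    last row holds column sums of the data part.
    """
    n, m = len(cf) - 1, len(cf[0]) - 1
    bad_rows = [i for i in range(n) if sum(cf[i][:m]) != cf[i][m]]
    bad_cols = [j for j in range(m) if sum(cf[i][j] for i in range(n)) != cf[n][j]]
    if not bad_rows and not bad_cols:
        return None
    if len(bad_rows) == 1 and len(bad_cols) == 1:
        return bad_rows[0], bad_cols[0]
    # Assumption: faults in the checksum entries themselves, or multiple
    # faults, are outside the scope of this sketch.
    raise ValueError("error pattern is not a single data-element fault")


def correct(cf):
    """Repair a single corrupted data element in place; return its location."""
    loc = locate_error(cf)
    if loc is not None:
        i, j = loc
        m = len(cf[0]) - 1
        # Rebuild the element from the row checksum and the other row entries.
        cf[i][j] = cf[i][m] - sum(cf[i][k] for k in range(m) if k != j)
    return loc
```

For example, with `a = [[1, 2], [3, 4]]` and `b = [[5, 6], [7, 8]]`, calling `matmul(column_checksum(a), row_checksum(b))` gives a 3x3 matrix whose upper-left 2x2 part is the ordinary product and whose last row and column are its checksums. Overwriting one data element and calling `correct` restores it. With floating-point data the equality tests would need a tolerance, which this sketch omits.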