• DocumentCode
    3476722
  • Title
    Failure detection algorithms for a reliable execution of parallel programs
  • Author
    Chabridon, Sophie; Gelenbe, Erol
  • Author_Institution
    UFR de Math. et Inf., Univ. Rene Descartes, Paris, France
  • fYear
    1995
  • fDate
    13-15 Sep 1995
  • Firstpage
    229
  • Lastpage
    238
  • Abstract
    We report on the design and simulation of novel algorithms which will ensure that application software runs correctly on a MIMD system in which processing units (PUs) can fail. The effect of these algorithms is evaluated by simulation on random task graphs as failure rates increase. A specific application, the Fast Fourier Transform, is also examined: we construct its task graph and then simulate its execution under various processor failure rates.
  • Keywords
    fault tolerant computing; parallel processing; reliability; system recovery; MIMD system; failure detection algorithms; failure rates; parallel programs; random task graphs; reliable execution; task graph; Algorithm design and analysis; Application software; Computational modeling; Databases; Delay; Detection algorithms; Fast Fourier transforms; Parallel processing; Software algorithms; Surges
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Reliable Distributed Systems, 1995. Proceedings., 14th Symposium on
  • Conference_Location
    Bad Neuenahr
  • ISSN
    1060-9857
  • Print_ISBN
    0-8186-7153-X
  • Type
    conf
  • DOI
    10.1109/RELDIS.1995.526230
  • Filename
    526230