DocumentCode
3476722
Title
Failure detection algorithms for a reliable execution of parallel programs
Author
Chabridon, Sophie ; Gelenbe, Erol
Author_Institution
UFR de Math. et Inf., Univ. Rene Descartes, Paris, France
fYear
1995
fDate
13-15 Sep 1995
Firstpage
229
Lastpage
238
Abstract
We report on the design and simulation of novel algorithms that ensure application software runs correctly on a MIMD system in which processing units (PUs) can fail. The effect of these algorithms is evaluated by simulation on random task graphs as failure rates increase. A specific application, the Fast Fourier Transform, is also examined: we construct its task graph and then simulate its execution under various processor failure rates.
Keywords
fault tolerant computing; parallel processing; reliability; system recovery; MIMD system; failure detection algorithms; failure rates; parallel programs; random task graphs; reliable execution; task graph; Algorithm design and analysis; Application software; Computational modeling; Databases; Delay; Detection algorithms; Fast Fourier transforms; Parallel processing; Software algorithms; Surges
fLanguage
English
Publisher
IEEE
Conference_Titel
Reliable Distributed Systems, 1995. Proceedings., 14th Symposium on
Conference_Location
Bad Neuenahr
ISSN
1060-9857
Print_ISBN
0-8186-7153-X
Type
conf
DOI
10.1109/RELDIS.1995.526230
Filename
526230