DocumentCode
2453783
Title
CCK: An Improved Coordinated Checkpoint/Rollback Protocol for Dataflow Applications in Kaapi
Author
Besseron, Xavier ; Jafar, Samir ; Gautier, Thierry ; Roch, Jean-Louis
Author_Institution
Lab. ID-IMAG, Projet MOAIS (CNRS/INPG/INRIA/UJF), Montbonnot
Volume
2
fYear
0
fDate
0-0 0
Firstpage
3353
Lastpage
3358
Abstract
Fault tolerance protocols play an important role in today's long-running scientific parallel applications, because the probability of failure can be significant given the number of unreliable components involved in a simulation. In this paper we present our approach and preliminary results for a new checkpoint/recovery protocol based on a coordinated scheme. This protocol is tightly coupled to the availability of an abstract representation of the execution
Keywords
checkpointing; data flow computing; data flow graphs; software fault tolerance; KAAPI application; coordinated checkpoint/rollback protocol; dataflow application; dataflow graph; execution abstract representation; fault tolerance protocols; runtime scientific parallel application; Computational modeling; Concurrent computing; Context modeling; Fault tolerance; Fault tolerant systems; Large-scale systems; Middleware; Protocols; Runtime; Virtual reality; Checkpoint/Recovery; Dataflow Graph; Parallel Application;
fLanguage
English
Publisher
ieee
Conference_Titel
Information and Communication Technologies, 2006. ICTTA '06. 2nd
Conference_Location
Damascus
Print_ISBN
0-7803-9521-2
Type
conf
DOI
10.1109/ICTTA.2006.1684955
Filename
1684955