• DocumentCode
    2453783
  • Title
    CCK: An Improved Coordinated Checkpoint/Rollback Protocol for Dataflow Applications in Kaapi
  • Author
    Besseron, Xavier ; Jafar, Samir ; Gautier, Thierry ; Roch, Jean-Louis
  • Author_Institution
    Lab. ID-IMAG, Projet MOAIS (CNRS/INPG/INRIA/UJF), Montbonnot
  • Volume
    2
  • fYear
    0
  • fDate
    0-0 0
  • Firstpage
    3353
  • Lastpage
    3358
  • Abstract
    Fault tolerance protocols play an important role in today's long-running scientific parallel applications, because the probability of failure can be significant given the number of unreliable components involved in a simulation. In this paper we present our approach and preliminary results for a new checkpoint/recovery protocol based on a coordinated scheme. This protocol is tightly coupled to the availability of an abstract representation of the execution
  • Keywords
    checkpointing; data flow computing; data flow graphs; software fault tolerance; KAAPI application; coordinated checkpoint/rollback protocol; dataflow application; dataflow graph; execution abstract representation; fault tolerance protocols; runtime scientific parallel application; Computational modeling; Concurrent computing; Context modeling; Fault tolerance; Fault tolerant systems; Large-scale systems; Middleware; Protocols; Runtime; Virtual reality; Checkpoint/Recovery; Dataflow Graph; Parallel Application;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Information and Communication Technologies, 2006. ICTTA '06. 2nd
  • Conference_Location
    Damascus
  • Print_ISBN
    0-7803-9521-2
  • Type
    conf
  • DOI
    10.1109/ICTTA.2006.1684955
  • Filename
    1684955