• DocumentCode
    2265464
  • Title
    On checkpointing strategies in unreliable computing environments
  • Author
    Fiorini, P.M.
  • Author_Institution
    C&F Search Marketing, Miami, FL, USA
  • Volume
    1
  • fYear
    2011
  • fDate
    15-17 Sept. 2011
  • Firstpage
    193
  • Lastpage
    197
  • Abstract
    In this paper, we analyze the performance implications of checkpointing strategies in unreliable computing environments. We show that if an appropriate checkpointing strategy is not chosen, the time to complete a job follows a heavy-tailed distribution, which can lead to long and highly variable completion times. We derive asymptotics for job completion times under three schemes (no checkpointing, a fixed number of random checkpoints, and checkpoints at fixed intervals) for various task time distributions. These asymptotic results are obtained using large deviation theory.
  • Keywords
    checkpointing; reliability; ubiquitous computing; asymptotics; checkpointing strategies; deviation theory; fixed intervals; heavy tailed distributed system; job completion times; random checkpoints; task time distribution; unreliable computing environments; Computational modeling; Equations; Markov processes; Mathematical model; Random variables; Tin; RESTART; failure; heavy-tail; large deviation theory; pri; recovery; unreliable systems
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Intelligent Data Acquisition and Advanced Computing Systems (IDAACS), 2011 IEEE 6th International Conference on
  • Conference_Location
    Prague
  • Print_ISBN
    978-1-4577-1426-9
  • Type
    conf
  • DOI
    10.1109/IDAACS.2011.6072739
  • Filename
    6072739