  • DocumentCode
    2194274
  • Title
    On the Scheduling of Checkpoints in Desktop Grids
  • Author
    Bouguerra, Mohamed Slim ; Kondo, Derrick ; Trystram, Denis
  • Author_Institution
    INRIA Rhône-Alpes Grenoble, ZIRST, Montbonnot-Saint-Martin, France
  • fYear
    2011
  • fDate
    23-26 May 2011
  • Firstpage
    305
  • Lastpage
    313
  • Abstract
    Frequent resource failures are a major challenge for the rapid completion of batch jobs. Checkpointing and migration is one approach to accelerating job completion while avoiding deadlock. We study the problem of scheduling checkpoints of sequential jobs in the context of Desktop Grids, which consist of volunteered distributed resources. We craft a checkpoint scheduling algorithm that is provably optimal for discrete time when failures obey any general probability distribution. Using simulations with parameters based on real-world systems, we show that this optimal strategy scales and significantly outperforms other strategies in terms of checkpointing costs and batch completion times.
  • Keywords
    checkpointing; grid computing; batch completion times; batch jobs; checkpoint scheduling algorithm; checkpointing costs; desktop grids; discrete time; general probability distribution; job completion; volunteered distributed resources; Availability; Bandwidth; Checkpointing; Computational modeling; Markov processes; Processor scheduling; Servers; Checkpoint; Fault tolerance; Volunteer Computing
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Cluster, Cloud and Grid Computing (CCGrid), 2011 11th IEEE/ACM International Symposium on
  • Conference_Location
    Newport Beach, CA
  • Print_ISBN
    978-1-4577-0129-0
  • Electronic_ISBN
    978-0-7695-4395-6
  • Type
    conf
  • DOI
    10.1109/CCGrid.2011.63
  • Filename
    5948621