• DocumentCode
    1307175
  • Title
    Meeting Soft Deadlines in Scientific Workflows Using Resubmission Impact
  • Author
    Plankensteiner, Kassian; Prodan, Radu
  • Author_Institution
    Inst. of Comput. Sci., Univ. of Innsbruck, Innsbruck, Austria
  • Volume
    23
  • Issue
    5
  • fYear
    2012
  • fDate
    5/1/2012 12:00:00 AM
  • Firstpage
    890
  • Lastpage
    901
  • Abstract
    We propose a new heuristic called Resubmission Impact to support fault-tolerant execution of scientific workflows in heterogeneous parallel and distributed computing environments. In contrast to related approaches, our method can be used effectively in new or unfamiliar environments, even in the absence of historical executions or failure trace models. On top of this method, we propose a dynamic enactment and rescheduling heuristic that executes workflows with a high degree of fault tolerance while taking soft deadlines into account. Simulated experiments with three real-world workflows in the Austrian Grid demonstrate that our method significantly reduces resource waste compared to conservative task replication and resubmission techniques, while achieving a comparable makespan and only a slight decrease in the success probability. In addition, the dynamic enactment method successfully meets soft deadlines in faulty environments in the absence of historical failure trace information or models.
  • Keywords
    grid computing; parallel processing; probability; scheduling; workflow management software; Austrian Grid workflow; conservative task replication technique; distributed computing environment; dynamic enactment heuristic; dynamic enactment method; failure trace model; fault tolerance; heterogeneous parallel computing environment; historical execution model; rescheduling heuristic; resubmission impact heuristic; resubmission technique; scientific workflow; soft deadline meeting; success probability; workflow execution; Computational modeling; Fault tolerance; Fault tolerant systems; Heuristic algorithms; Processor scheduling; Quality of service; Schedules; Scientific workflows; cloud computing; fault tolerance; grid computing; scheduling
  • fLanguage
    English
  • Journal_Title
    Parallel and Distributed Systems, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1045-9219
  • Type
    jour
  • DOI
    10.1109/TPDS.2011.221
  • Filename
    5999661