• DocumentCode
    1753482
  • Title

    On demand check pointing for grid application reliability using communicating process model

  • Author

    Baghavathi Priya, S. ; Subramaniam, Chandrasekaran ; Ravichandran, T.

  • Author_Institution
    Jawaharlal Nehru Technol. Univ., Hyderabad, India
  • fYear
    2011
  • fDate
    13-16 Feb. 2011
  • Firstpage
    393
  • Lastpage
    398
  • Abstract
    The objective of this work is to propose an on-demand asynchronous check pointing technique for fault recovery of a grid application using a communicating process approach. The processes are formally modelled in LOTOS, with process features declared in terms of the rollback possibilities and the replicas permitted to accept the tasks assigned by the scheduler. If a process tends to become faulty at run time, the check pointing mechanism detects it through the Task Dependency Graph (TDG), and the respective worst-case execution time and deadline parameters are used to decide schedulability. The Asynchronous Check Pointing On Demand (ACP-OD) approach enhances grid application reliability through the required fault-tolerant services. Concurrent tasks are scheduled with the proposed Concurrent Task Scheduling Algorithm (CTSA), which recovers from faulty states using replication or rollback techniques. Check pointing and replication mechanisms are used, and synchronization between the communicating processes is needed to enhance the efficiency of the check pointing mechanism. The model is tested with a number of rollback variables, treating the application as a Stochastic Activity Network (SAN) using Mobius.
  • Keywords
    grid computing; scheduling; software fault tolerance; LOTOS model; Mobius; asynchronous check pointing on demand approach; asynchronous check pointing technique; communicating process model; concurrent task scheduling algorithm; demand check pointing technique; fault recovery; fault tolerant service; grid application; replication technique; rollback technique; stochastic activity network; task dependency graph; Computational modeling; Fault tolerance; Fault tolerant systems; Information services; Schedules; Synchronization; Check pointing; Process; Reliability; Replication; Rollback;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Advanced Communication Technology (ICACT), 2011 13th International Conference on
  • Conference_Location
    Seoul
  • ISSN
    1738-9445
  • Print_ISBN
    978-1-4244-8830-8
  • Type

    conf

  • Filename
    5745839