• DocumentCode
    1298032
  • Title

    A watchdog processor based general rollback technique with multiple retries

  • Author

    Upadhyaya, Shambhu J. ; Saluja, Kewal K.

  • Author_Institution
    Dept. of Electr. & Comput. Eng., Newcastle Univ., NSW, Australia
  • Issue
    1
  • fYear
    1986
  • Firstpage
    87
  • Lastpage
    95
  • Abstract
    A common assumption in existing rollback techniques is that transients, the cause of most failures, subside very quickly, implying that a single retry of the program from the previous rollback point is sufficient. The authors discuss a general rollback strategy with n (n ≥ 2) retries, which takes into consideration multiple transient failures as well as transients of long duration. Ways of deriving practical values of n for a given program are also discussed. Furthermore, the authors propose the use of a watchdog processor as an error detection tool to initiate recovery action through rollback, since the watchdog processor offers low error latency. They also discuss merging the watchdog processor with the rollback recovery technique to enhance overall system reliability.
  • Keywords
    fault tolerant computing; program processors; software reliability; system recovery; error detection tool; multiple retries; recovery action; rollback technique; system reliability; transients; watchdog processor; Australia; Computational modeling; Computers; Hardware; Image edge detection; Load modeling; Transient analysis; Error detection; error latency; program retry; recovery time; rollback recovery; transient errors
  • fLanguage
    English
  • Journal_Title
    Software Engineering, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    0098-5589
  • Type

    jour

  • DOI
    10.1109/TSE.1986.6312923
  • Filename
    6312923