• DocumentCode
    2146778
  • Title
    Efficient software-based fault tolerance approach on multicore platforms
  • Author
    Mushtaq, Hamid ; Al-Ars, Zaid ; Bertels, Koen
  • Author_Institution
    Computer Engineering Laboratory, Delft University of Technology, Netherlands
  • fYear
    2013
  • fDate
    18-22 March 2013
  • Firstpage
    921
  • Lastpage
    926
  • Abstract
    This paper describes a low-overhead software-based fault tolerance approach for shared-memory multicore systems. The scheme is implemented at user-space level and requires almost no changes to the original application. Redundant multithreaded processes are used to detect soft errors and recover from them. Our scheme ensures that the redundant processes execute identically even in the presence of non-determinism due to shared memory accesses, and it achieves this with a very low-overhead mechanism. Moreover, it implements a fast error detection and recovery mechanism. The overhead incurred by our approach ranges from 0% to 18% for selected benchmarks, which is lower than that of comparable systems published in the literature.
  • Keywords
    Benchmark testing; Clocks; Fault tolerance; Fault tolerant systems; Instruction sets; Libraries; Synchronization
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Design, Automation & Test in Europe Conference & Exhibition (DATE), 2013
  • Conference_Location
    Grenoble, France
  • ISSN
    1530-1591
  • Print_ISBN
    978-1-4673-5071-6
  • Type
    conf
  • DOI
    10.7873/DATE.2013.194
  • Filename
    6513640