• DocumentCode
    1687973
  • Title
    Efficient software checking for fault tolerance
  • Author
    Yu, Jing ; Garzarán, María Jesús ; Snir, Marc
  • Author_Institution
    Dept. of Comput. Sci., Univ. of Illinois at Urbana-Champaign, Urbana, IL
  • fYear
    2008
  • Firstpage
    1
  • Lastpage
    5
  • Abstract
    Dramatic increases in the number of transistors that can be integrated on a chip make processors more susceptible to radiation-induced transient errors. For commodity chips, which are cost- and energy-constrained, software approaches can play a major role in fault detection because they can be tailored to different reliability and performance requirements. However, software approaches add significant performance overhead because they replicate instructions and add checking instructions to compare the results. To make software checking approaches more attractive, we use compiler techniques to identify the "unnecessary" replicas and checking instructions. In this paper, we present three techniques. The first technique uses Boolean logic to identify code patterns that correspond to outcome-tolerant branches. The second technique identifies address checks before loads and stores that can be removed with different degrees of fault coverage. The third technique identifies the checking instructions and shadow registers that are unnecessary when the register file is protected in hardware. By combining the three techniques, the overhead of software approaches can be reduced by 50% on average.
  • Keywords
    Boolean algebra; optimising compilers; program debugging; software fault tolerance; storage allocation; Boolean logic; compiler techniques; fault tolerance; replicated optimized code; shadow registers; software checking instructions; Boolean functions; Computer errors; Error correction; Error correction codes; Fault diagnosis; Fault tolerance; Hardware; Logic; Registers; Software performance
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Parallel and Distributed Processing, 2008. IPDPS 2008. IEEE International Symposium on
  • Conference_Location
    Miami, FL
  • ISSN
    1530-2075
  • Print_ISBN
    978-1-4244-1693-6
  • Electronic_ISBN
    1530-2075
  • Type
    conf
  • DOI
    10.1109/IPDPS.2008.4536435
  • Filename
    4536435