• DocumentCode
    3131801
  • Title
    A Low-Cost Fault Tolerance Technique in Multi-media Applications through Configurability
  • Author
    Lanfang Tan ; Ying Tan
  • Author_Institution
    Nat. Lab. of Parallel & Distrib. Process., Changsha, China
  • fYear
    2013
  • fDate
    29-30 July 2013
  • Firstpage
    299
  • Lastpage
    304
  • Abstract
    As chip densities and clock rates increase, processors are becoming more susceptible to transient faults that affect program correctness. Fault tolerance is therefore becoming increasingly important in computing systems. Two major concerns of fault tolerance techniques are: a) improving system reliability by detecting transient errors and b) reducing performance overhead. In this study, we propose a configurable fault tolerance technique targeting both high reliability and low performance overhead for multi-media applications. The basic principle is fault tolerance configurability: different degrees of fault tolerance are applied to different parts of the source code of a multi-media application. First, a primary analysis is performed at the source code level to identify the critical statements. Second, a fault injection process combined with statistical analysis is used to validate the partition with regard to a confidence degree. Finally, checksum-based fault tolerance and instruction duplication are applied to critical statements, while no fault tolerance mechanism is applied to non-critical parts. Performance experiments demonstrate that our configurable fault tolerance technique yields significant performance gains compared with duplicating all instructions. The fault coverage of this scheme is also evaluated. Fault injection results show that about 90% of outputs are correct at the application level with just 20% runtime overhead.
  • Keywords
    multimedia systems; software fault tolerance; statistical analysis; application-level correctness; checksum-based fault tolerance; confidence degree; fault injection process; fault tolerance configurability; instruction duplication; low-cost fault tolerance technique; multimedia applications; performance overhead reduction; primary analysis; source code level; statistical analysis; system reliability; transient error detection; Benchmark testing; Fault tolerance; Fault tolerant systems; Multimedia communication; Registers; Statistical analysis; application-level correctness; checksum; configurable fault tolerance; critical segments; multi-media applications;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Quality Software (QSIC), 2013 13th International Conference on
  • Conference_Location
    Nanjing
  • Type
    conf
  • DOI
    10.1109/QSIC.2013.25
  • Filename
    6605943
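
  • Illustration
    The abstract describes applying checksum-based protection and instruction duplication only to critical statements, leaving non-critical parts unprotected. The sketch below shows one way that idea might look in source code. It is not the authors' implementation: the names `checksum`, `duplicated`, `process_frame`, and `TransientFaultDetected`, the additive checksum, and the choice of which statements count as critical are all assumptions made for illustration.

```python
from typing import Callable, List, Sequence, Tuple


class TransientFaultDetected(Exception):
    """Raised when a protected value or computation fails its check (hypothetical name)."""


def checksum(values: Sequence[int]) -> int:
    """Simple additive checksum truncated to 32 bits (an assumed, illustrative scheme)."""
    return sum(values) & 0xFFFFFFFF


def duplicated(compute: Callable[[], int]) -> int:
    """Run a critical computation twice and compare the two results."""
    first = compute()
    second = compute()
    if first != second:
        raise TransientFaultDetected("duplicated results disagree")
    return first


def process_frame(
    header: Sequence[int], pixels: Sequence[int], header_checksum: int
) -> Tuple[int, List[int]]:
    # Critical (assumed): the header drives indexing and control flow,
    # so it is verified against a checksum computed when it was written.
    if checksum(header) != header_checksum:
        raise TransientFaultDetected("header checksum mismatch")
    width = header[0]

    # Critical (assumed): the row count is computed twice and compared.
    rows = duplicated(lambda: len(pixels) // width)

    # Non-critical (assumed): a corrupted pixel degrades one sample of the
    # output but not its application-level usability, so no check is applied.
    brightened = [min(255, p + 16) for p in pixels]
    return rows, brightened


header = [4, 2]  # hypothetical layout: [width, height]
pixels = list(range(8))
rows, out = process_frame(header, pixels, checksum(header))
```

    In an interpreted language the duplication is only a model of the idea; a real deployment would more plausibly duplicate instructions at the compiler or assembly level, which the abstract does not detail.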