• DocumentCode
    1961851
  • Title

    Adaptive fault tolerance through invasive computing

  • Author

    Witterauf, Michael ; Tanase, Alexandru ; Teich, Jurgen ; Lari, Vahid ; Zwinkau, Andreas ; Snelting, Gregor

  • Author_Institution
    Friedrich-Alexander-Univ. Erlangen-Nurnberg (FAU), Erlangen, Germany
  • fYear
    2015
  • fDate
    15-18 June 2015
  • Firstpage
    1
  • Lastpage
    8
  • Abstract
    Fault tolerance is a basic necessity to make today's complex systems reliable. Adequate fault tolerance, however, demands a high degree of redundancy, possibly wasting resources when the fault probability is low or when some applications do not require fault tolerance. Under the term adaptive fault tolerance, we investigate means to instead provide on-demand fault tolerance on multi-core systems dynamically and according to application and environmental needs. Such means are provided on a per-application basis by invasive computing, a recent paradigm for resource-aware programming and design of parallel systems: applications request resources in an invade phase, infect the acquired resources with code and data, and finally release them in a retreat phase. We show how to use these simple but powerful constructs to adaptively tolerate faults and that invasive computing harmonizes well with many existing fault tolerance approaches. Finally, a case study on adaptively providing fault tolerance for loops demonstrates how effective invasive computing is for adapting to a varying soft error rate and handling of faults.
  • Keywords
    fault tolerant computing; multiprocessing systems; parallel processing; adaptive fault tolerance; fault handling; invasive computing; multi-core systems; parallel systems design; redundancy degree; resource-aware programming; soft error rate; Adaptation models; Fault tolerant systems; Hardware; Redundancy; Runtime;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Adaptive Hardware and Systems (AHS), 2015 NASA/ESA Conference on
  • Conference_Location
    Montreal, QC
  • Type

    conf

  • DOI
    10.1109/AHS.2015.7231155
  • Filename
    7231155