• DocumentCode
    245535
  • Title
    Implicit intermittent fault detection in distributed systems
  • Author
    Waszecki, Peter ; Kauer, Matthias ; Lukasiewycz, Martin ; Chakraborty, Shiladri
  • Author_Institution
    CREATE, TUM, Singapore, Singapore
  • fYear
    2014
  • fDate
    20-23 Jan. 2014
  • Firstpage
    646
  • Lastpage
    651
  • Abstract
    This paper presents a novel approach for detecting resources in distributed systems that exhibit an increased occurrence of intermittent faults, exceeding the level of unavoidable transient faults caused by environmental phenomena. Intermittent faults occur due to stressed resources and are often a precursor of permanent faults. The proposed early fault detection and diagnosis allows precautionary measures to be taken before a component in a distributed system fails permanently. We present four methods that implicitly detect intermittent faults by taking the distributed applications and their dependencies into account. Explicit tests, which would lead to additional costs and resource load, are therefore not required. Moreover, the implicit approach may considerably reduce the number of plausibility tests compared to the conservative solution with one test per resource. We analyzed and evaluated implementations of the proposed fault detection principle. The experimental results give evidence of the feasibility of our approach and compare the implemented methods in terms of runtime and detection rate.
  • Keywords
    distributed processing; fault diagnosis; software reliability; distributed systems; fault detection principle; implicit intermittent fault detection; permanent faults; precautionary measures; unavoidable transient faults; Equations; Fault detection; Mathematical model; Reliability; Runtime; Transient analysis; Vectors
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Design Automation Conference (ASP-DAC), 2014 19th Asia and South Pacific
  • Conference_Location
    Singapore
  • Type
    conf
  • DOI
    10.1109/ASPDAC.2014.6742964
  • Filename
    6742964