• DocumentCode
    3412995
  • Title

    A top-down approach to high-consequence fault analysis for software systems

  • Author

    Fronczak, Ed

  • Author_Institution
    Sandia Nat. Labs., Albuquerque, NM, USA
  • fYear
    1997
  • fDate
    2-5 Nov 1997
  • Firstpage
    259
  • Abstract
    Summary form only given, as follows. Even if software code is fault-free, hardware failures can alter values in memory, possibly where the code itself is stored, causing a computer system to reach an unacceptable state. Microprocessor systems are used to perform many safety and security functions where a design goal is to eliminate single-point failures such as these. One design approach is to use multiple processors, compare the outputs, and assume a failure has occurred if the outputs don't agree. In systems where the design is constrained to a single processor, however, analytical methods are needed to identify potential single-point failures at the bit level so that an effective fault-tolerant strategy can be employed. This paper describes a top-down methodology, based upon fault tree analysis, that has been used to identify potential high-consequence faults in microprocessor-based systems. The key to making the fault tree analysis tractable is to effectively incorporate appropriate design features such as software path control and checksums so that complicated branches of the fault tree can be terminated early. The analysis uses simplified software flow diagrams depicting relevant code elements. Pertinent sections of machine language are then examined to identify suspect hardware. A comparison of this methodology with approaches based upon failure modes and effects analysis (FMEA) is made. The methodology is demonstrated through a simple example. Use of fault trees to show that software code is free of safety or security faults is also demonstrated.
  • Keywords
    fault trees; flowcharting; safety-critical software; software fault tolerance; system recovery; analytical methods; bit level; checksums; complicated branch termination; design features; failure modes and effect analysis; fault tree analysis; fault-free software code; fault-tolerant strategy; hardware failures; high-consequence fault analysis; machine language; memory values; microprocessor systems; safety functions; security functions; single-point failures; software flow diagrams; software path control; software systems; suspect hardware identification; top-down approach; tractability; unacceptable system state; Failure analysis; Fault diagnosis; Fault tolerant systems; Fault trees; Hardware; Laboratories; Microprocessors; Security; Software safety; Software systems
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Software Reliability Engineering, 1997. Proceedings., The Eighth International Symposium on
  • Conference_Location
    Albuquerque, NM
  • Print_ISBN
    0-8186-8120-9
  • Type

    conf

  • DOI
    10.1109/ISSRE.1997.630873
  • Filename
    630873