• DocumentCode
    3474922
  • Title
    A method for the construction and interpretation of high level models for distributed fault-tolerant systems

  • Author
    Tilly, K. ; Kiss, I. ; Roman, Graciela ; Dobrowiecki, T. ; Várkonyi-Kóczy, A.R.
  • Author_Institution
    Dept. of Meas. & Instrum. Eng., Budapest Tech. Univ., Hungary
  • fYear
    1995
  • fDate
    13-15 Sep 1995
  • Firstpage
    72
  • Lastpage
    81
  • Abstract
    Traditional solutions for achieving fault tolerance are intended for use at design time, and they generally capture system information at a very low (hardware or machine instruction) level. Increasing the reliability of complex information systems containing many (perhaps many thousands of) autonomous components requires different solutions. This article presents a new methodology for the implementation of large-scale, distributed fault-tolerant systems. System models are formed of objects describing requirements, services and resources, organized into high-level, top-down hierarchical decomposition structures. Since redundancy is a natural property of any large-scale system, such models make it possible to achieve fault-tolerant behaviour by finding multiple appropriate mappings between requirements and available services, and by supporting the required services with available resources. The distributed system is extended with dedicated components, called diagnostic centres, which manage distinct parts of the system model, continuously observe the operation of the distributed system, and find alternative requirement-service mappings if some services fail to fulfil their associated requirements. The elements and the structure of the proposed system modelling method are presented, an appropriate fault model is defined, and the algorithms for model interpretation are described.
  • Keywords
    distributed processing; fault tolerant computing; autonomous components; complex information systems; design time; diagnostic centres; distributed fault-tolerant systems; high level models; high level top-down hierarchical decomposition structures; system information; system modelling method; Computer architecture; Distributed computing; Fault tolerance; Fault tolerant systems; Hardware; Information systems; Instruments; Large-scale systems; Redundancy; Reliability engineering
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Reliable Distributed Systems, 1995. Proceedings., 14th Symposium on
  • Conference_Location
    Bad Neuenahr
  • ISSN
    1060-9857
  • Print_ISBN
    0-8186-7153-X
  • Type
    conf
  • DOI
    10.1109/RELDIS.1995.526215
  • Filename
    526215