• DocumentCode
    1219499
  • Title

    Hierarchical modeling of availability in distributed systems

  • Author

    Hariri, Salim ; Mutlu, Hasan

  • Author_Institution
    Dept. of Electr. & Comput. Eng., Syracuse Univ., NY, USA
  • Volume
    21
  • Issue
    1
  • fYear
    1995
  • fDate
    1/1/1995 12:00:00 AM
  • Firstpage
    50
  • Lastpage
    56
  • Abstract
    Distributed computing systems are attractive due to the potential improvement in availability, fault-tolerance, performance, and resource sharing. Modeling and evaluation of such computing systems is an important step in the design process of distributed systems. We present a two-level hierarchical model to analyze the availability of distributed systems. At the higher level (user level), the availability of the tasks (processes) is analyzed using a graph-based approach. At the lower level (component level), detailed Markov models are developed to analyze the component availabilities. These models take into account hardware/software failures, congestion and collisions in communication links, allocation of resources, and the redundancy level. A systematic approach is developed to apply the two-level hierarchical model to evaluate the availability of the processes and the services provided by a distributed computing environment. This approach is then applied to analyze some of the distributed processes of a real distributed system, the Unified Workstation Environment (UWE), that is currently being implemented at AT&T Bell Laboratories.
  • Keywords
    Markov processes; computer network reliability; distributed processing; fault tolerant computing; redundancy; reliability; Markov models; Unified Workstation Environment; availability; communication links; congestion; distributed computing environment; distributed system availability; fault-tolerance; graph-based approach; hierarchical modeling; redundancy level; resource sharing; two-level hierarchical model; Availability; Design engineering; Distributed computing; Fault tolerant systems; Fault trees; Reliability engineering; Resource management; Steady-state; Throughput; Time sharing computer systems
  • fLanguage
    English
  • Journal_Title
    Software Engineering, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    0098-5589
  • Type

    jour

  • DOI
    10.1109/32.341847
  • Filename
    341847