• DocumentCode
    2593265
  • Title

    The fault tolerant parallel processor operating system concepts and performance measurement overview

  • Author

    Babikyan, Carol A.

  • Author_Institution
    Charles Stark Draper Lab. Inc., Cambridge, MA, USA
  • fYear
    1990
  • fDate
    15-18 Oct 1990
  • Firstpage
    366
  • Lastpage
    371
  • Abstract
    It is pointed out that mission critical applications of the future will require a computing system capable of high throughput as well as very high reliability. The fault tolerant parallel processor (FTPP), a system designed specifically to satisfy these goals, is described. The FTPP architecture consists of interconnection network/redundancy management hardware and standard commercial processors. The architecture provides flexibility in choosing the appropriate balance of throughput and reliability for a given application. Furthermore, to maintain high system reliability, the FTPP expeditiously identifies faulty components and performs some remedial operations. These redundancy management functions are performed by the operating system to relieve the application of any knowledge of the underlying fault tolerance. How the operating system achieves redundancy management in conjunction with the fault tolerant hardware is described. Performance data to characterize system behavior are presented. Performance measurements indicate that the cost of fault tolerance is not a significant penalty: performing redundancy management functions requires a mere 0.93 ms/frame more than a simplex processor performing no redundancy management.
  • Keywords
    fault tolerant computing; parallel architectures; performance evaluation; redundancy; reliability; Byzantine resilience; cluster architecture; cost; digital avionics; fault tolerant parallel processor; flexibility; interconnection network/redundancy management; message handling; mission critical applications; mission critical system; parallel architecture; performance measurement; redundancy management; reliability; synchronisation; Computer architecture; Computer network management; Fault tolerant systems; Hardware; Maintenance; Mission critical systems; Multiprocessor interconnection networks; Operating systems; Redundancy; Throughput;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Digital Avionics Systems Conference, 1990. Proceedings., IEEE/AIAA/NASA 9th
  • Conference_Location
    Virginia Beach, VA
  • Type

    conf

  • DOI
    10.1109/DASC.1990.111316
  • Filename
    111316