• DocumentCode
    3354848
  • Title

    Efficient utilization of spare capacity for fault detection and location in multiprocessor systems

  • Author

    Tridandapani, S. ; Somani, A.K.

  • Author_Institution
    Washington Univ., Seattle, WA, USA
  • fYear
    1992
  • fDate
    8-10 July 1992
  • Firstpage
    440
  • Lastpage
    447
  • Abstract
    One scheme for detecting faults at the processor level in a multiprocessor system (see A. Dahbura et al., 1989) works by running secondary versions of jobs on the unused, or spare, processors of the system. The authors build upon this scheme and propose three new multiprocessor allocation strategies that run a variable number of versions per job. These schemes permit online detection and, in some cases, location of faulty processors in a system, without degrading its delay/throughput performance. Two new metrics, the fault detection capability and the fault location capability, are introduced to evaluate these schemes. Extensive simulation results are provided to show that these schemes utilize spare capacity more efficiently, thereby improving upon the fault detection and location capabilities of the system.
  • Keywords
    computer testing; fault location; fault tolerant computing; multiprocessing systems; resource allocation; fault detection; fault detection capability; fault location; fault location capability; multiprocessor allocation strategies; multiprocessor system; multiprocessor systems; spare capacity; Acceleration; Computational modeling; Computer science; Degradation; Electrical fault detection; Fault detection; Fault location; Hardware; Multiprocessing systems; Throughput;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Fault-Tolerant Computing, 1992. FTCS-22. Digest of Papers., Twenty-Second International Symposium on
  • Conference_Location
    Boston, MA, USA
  • Print_ISBN
    0-8186-2875-8
  • Type

    conf

  • DOI
    10.1109/FTCS.1992.243591
  • Filename
    243591