• DocumentCode
    2501506
  • Title

    Fault diagnosis for sparsely interconnected multiprocessor systems

  • Author

    Blough, D.M. ; Sullivan, G.F. ; Masson, G.M.

  • Author_Institution
    Dept. of Electr. Eng., California Univ., Irvine, CA, USA
  • fYear
    1989
  • fDate
    21-23 June 1989
  • Firstpage
    62
  • Lastpage
    69
  • Abstract
    The authors present a general approach to fault diagnosis that is widely applicable and requires only a limited number of connections among units. Each unit in the system forms a private opinion on the status of each of its neighboring units based on duplication of jobs and comparison of job results over time. A diagnosis algorithm that consists of simply taking a majority vote among the neighbors of a unit to determine the status of that unit is then executed. The performance of this simple majority-vote diagnosis algorithm is analyzed using a probabilistic model for the faults in the system. It is shown that with high probability, for systems composed of n units, the algorithm will correctly identify the status of all units when each unit is connected to O(log n) other units. It is also shown that the algorithm works with high probability in a class of systems in which the average number of neighbors of a unit is constant. The results indicate that fault diagnosis can in fact be achieved quite simply in multiprocessor systems containing a low to moderate number of testing conditions.<>
  • Keywords
    automatic testing; computational complexity; fault tolerant computing; multiprocessing systems; comparison of job results; diagnosis by comparison; duplication of jobs; fault diagnosis; hypercube fault diagnosis; majority vote; majority-vote diagnosis algorithm; probabilistic model; sparsely interconnected multiprocessor systems; Algorithm design and analysis; Application software; Circuit faults; Computer science; Fault diagnosis; Hypercubes; Large-scale systems; Multiprocessing systems; System testing; Voting;
  • fLanguage
    English
  • Publisher
    ieee
  • Conference_Titel
    Fault-Tolerant Computing, 1989. FTCS-19. Digest of Papers., Nineteenth International Symposium on
  • Conference_Location
    Chicago, IL, USA
  • Print_ISBN
    0-8186-1959-7
  • Type

    conf

  • DOI
    10.1109/FTCS.1989.105544
  • Filename
    105544