Title :
Parallel self-diagnosis of large multiprocessor systems under the generalized comparison model
Author :
Abrougui, Kaouther ; Elhadef, Mourad
Author_Institution :
Sch. of Inf. Technol. & Eng., Ottawa Univ., Ont., Canada
Abstract :
This paper addresses the problem of self-diagnosis in multiprocessor and multicomputer systems. We consider the generalized comparison model, in which jobs are assigned to pairs of nodes (processors) and the results are compared by the system's nodes themselves (self-diagnosis). The agreements and disagreements among the nodes are the basis for identifying faulty nodes. Genetic algorithms (GAs) have been used successfully to identify the set of faulty nodes in t-diagnosable systems, where the number of faulty nodes is bounded by t. The major drawback of this technique is that it is time-consuming, especially for large systems. In this paper, we describe a new parallel version of the existing evolutionary diagnosis method, which exploits competing sub-populations to speed up the diagnosis algorithm. Experimental results show that the new parallel version considerably improves the response time of the diagnosis algorithm, thereby allowing faster identification of faulty nodes.
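The comparison-based setting described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' genetic algorithm: it assumes the usual comparison-model convention that a fault-free comparator reports 0 when both compared nodes are fault-free and 1 otherwise, while a faulty comparator's outcome is unreliable. An exhaustive search over fault sets of size at most t stands in for the evolutionary search. The names `is_consistent`, `diagnose`, and the five-node example syndrome are hypothetical and chosen only for illustration.

```python
from itertools import combinations

def is_consistent(faulty, syndrome):
    """Check whether a candidate fault set can explain a syndrome.

    syndrome maps (k, i, j) -> 0 or 1, where comparator node k compares
    the job results of nodes i and j (assumed convention, see lead-in).
    """
    for (k, i, j), outcome in syndrome.items():
        if k in faulty:
            continue  # a faulty comparator may report anything
        expected = 1 if (i in faulty or j in faulty) else 0
        if outcome != expected:
            return False
    return True

def diagnose(nodes, syndrome, t):
    """Return every fault set of size <= t consistent with the syndrome.

    Brute-force stand-in for the GA; only practical for small systems.
    """
    return [
        set(candidate)
        for size in range(t + 1)
        for candidate in combinations(nodes, size)
        if is_consistent(set(candidate), syndrome)
    ]

# Hypothetical 5-node ring: each node compares its two neighbours.
# Node 2 is faulty; its own comparison outcome (0 here) is arbitrary.
example_syndrome = {
    (0, 1, 4): 0,
    (1, 0, 2): 1,
    (2, 1, 3): 0,
    (3, 2, 4): 1,
    (4, 3, 0): 0,
}
result = diagnose(range(5), example_syndrome, t=1)
```

In this small example the only consistent fault set of size at most 1 is `{2}`. The number of candidate sets grows combinatorially with the system size and t, which is the cost that heuristic searches such as GAs are meant to avoid.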
Keywords :
fault diagnosis; fault tolerant computing; genetic algorithms; multiprocessing systems; parallel processing; evolutionary diagnosis; faulty nodes; generalized comparison model; genetic algorithms; multicomputer systems; multiprocessor systems; parallel self-diagnosis; Delay; Fault detection; Fault diagnosis; Genetic algorithms; Information technology; Multiprocessing systems; Parallel processing; Performance evaluation; System testing;
Conference_Title :
Parallel and Distributed Systems, 2005. Proceedings. 11th International Conference on
Print_ISBN :
0-7695-2281-5
DOI :
10.1109/ICPADS.2005.217