Title :
Reducing message overhead in TMR systems
Author :
Ramirez, John C. ; Melhem, Rami G.
Author_Institution :
Pittsburgh Univ., PA, USA
Abstract :
Traditional TMR protocols assume either a single, reliable voter for each triple-modular redundant unit (TMRU) or triplicated voters (one for each processor) for each TMRU. In the first case, the voter is a single point of failure for the system. In the second case, many physical messages must be sent across the communication network for each logical data item. We examine protocols that attempt to maintain the functionality of the triplicated-voter TMR protocol while reducing the number of physical messages required by one third. We also examine possible solutions to the many issues that result from this reduction in communication. Three different reduced-communication triple-modular redundant (RTMR) protocols are considered, each of which makes different assumptions about the nature of the underlying computation.
Keywords :
fault tolerant computing; message passing; protocols; redundancy; TMR protocols; TMR systems; communication network; failure; logical data item; message overhead reduction; physical messages; reduced-communication triple-modular redundant protocols; single reliable voters; triple-modular redundant unit; triplicated voters; Circuit faults; Communication networks; Delay systems; Electrical fault detection; Error correction; Hardware; Protocols; Redundancy; Telecommunication network reliability; Timing
Conference_Title :
Distributed Computing Systems, 1999. Proceedings. 19th IEEE International Conference on
Conference_Location :
Austin, TX
Print_ISBN :
0-7695-0222-9
DOI :
10.1109/ICDCS.1999.776505