Title :
Fault tolerance for future COTS hardware
Author :
Mukherjee, Shubu
Author_Institution :
Massachusetts Microprocessor Design Center, Intel Corp., Shrewsbury, MA, USA
Abstract :
Summary form only given. With each technology generation, we are experiencing an increased rate of cosmically-induced soft errors in our microprocessor chips. The resulting advent of high levels of soft error protection on microprocessor chips will necessitate revisiting the design of highly dependable systems, such as those used in space, built with COTS (commodity off-the-shelf) hardware. However, given that processors would deal with soft errors internally, it is not clear whether highly dependable systems would additionally need to implement off-chip lockstepping to cope with permanent faults. Worse, implementing off-chip lockstepping at very high clock speeds, such as tens of gigahertz, is extremely difficult. Hence, designers may need to consider alternative techniques, such as processor sparing, to deal with permanent faults in processor chips used in mission-critical systems.
Keywords :
fault tolerant computing; microprocessor chips; commodity off-the-shelf hardware; cosmically-induced soft errors; highly dependable systems; mission-critical systems; off-chip lockstepping; processor sparing; soft error protection; Costs; Error analysis; Error correction codes; Fault tolerance; Hardware; Multithreading; Postal services; Protection; Radiation detectors
Conference_Title :
Dependable Computing, 2004. Proceedings. 10th IEEE Pacific Rim International Symposium on
Print_ISBN :
0-7695-2076-6
DOI :
10.1109/PRDC.2004.1276588