DocumentCode
1959019
Title
Responsive Fault-Tolerant Computing in the Era of Terascale Integration: State of Art Report
Author
Ezhilchelvan, Paul
Author_Institution
Sch. of Comput. Sci., Newcastle Univ., Newcastle upon Tyne
fYear
2008
fDate
5-7 May 2008
Firstpage
492
Lastpage
496
Abstract
Scaling in the hardware integration process results in IC-process geometry reductions, lower operating voltages, and increased clock speeds. This paper first surveys the reliability obstacles these developments give rise to and then points out that computing systems can no longer be safely assumed to fail only by crashing. Yet this assumption is at the core of primary-backup replication, which the literature presents as the appropriate, and hence the most widely used, strategy for time-critical fault-tolerant applications. The paper then observes that building computing nodes with an announced crash failure mode is a promising way to deal with the emerging reliability challenges. Work carried out to assure such a failure mode is also briefly surveyed.
Keywords
fault tolerant computing; integrated circuit reliability; IC-process geometry reductions; computing nodes; crash failure mode; hardware integration; primary-backup replication; reliability obstacles; responsive fault-tolerant computing; time-critical fault-tolerant applications; Art; Circuit faults; Clocks; Computer crashes; Distributed computing; Fault tolerance; Hardware; Microprocessors; Random access memory; Voltage; Announced Crashes; Crash Assumption; Hardware Integration; Primary-Backup Replication; Soft Errors
fLanguage
English
Publisher
IEEE
Conference_Titel
Object Oriented Real-Time Distributed Computing (ISORC), 2008 11th IEEE International Symposium on
Conference_Location
Orlando, FL
Print_ISBN
978-0-7695-3132-8
Type
conf
DOI
10.1109/ISORC.2008.42
Filename
4553326