• DocumentCode
    1959019
  • Title

    Responsive Fault-Tolerant Computing in the Era of Terascale Integration - State of Art Report

  • Author

    Ezhilchelvan, Paul

  • Author_Institution
    Sch. of Comput. Sci., Newcastle Univ., Newcastle upon Tyne
  • fYear
    2008
  • fDate
    5-7 May 2008
  • Firstpage
    492
  • Lastpage
    496
  • Abstract
    Scaling in the hardware integration process results in IC-process geometry reductions, lower operating voltages, and increased clock speeds. This paper first surveys the reliability obstacles these developments give rise to and then points out that computing systems can no longer be safely assumed to fail only by crashing. Yet this assumption is at the core of primary-backup replication, which the literature presents as the appropriate, and hence the most widely used, strategy for time-critical fault-tolerant applications. The paper then observes that building computing nodes with an announced crash failure mode is a promising way forward in dealing with the emerging reliability challenges. Work carried out to assure such a failure mode is also briefly surveyed.
  • Keywords
    fault tolerant computing; integrated circuit reliability; IC-process geometry reductions; computing nodes; crash failure mode; hardware integration; primary-backup replication; reliability obstacles; responsive fault-tolerant computing; time-critical fault-tolerant applications; Art; Circuit faults; Clocks; Computer crashes; Distributed computing; Fault tolerance; Hardware; Microprocessors; Random access memory; Voltage; Announced Crashes; Crash Assumption; Hardware Integration; Primary-Backup Replication; Soft Errors;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Object Oriented Real-Time Distributed Computing (ISORC), 2008 11th IEEE International Symposium on
  • Conference_Location
    Orlando, FL
  • Print_ISBN
    978-0-7695-3132-8
  • Type

    conf

  • DOI
    10.1109/ISORC.2008.42
  • Filename
    4553326