Title :
Fault detection and recovery in a data-driven real-time multiprocessor
Author :
Farquhar, William G. ; Evripidou, Paraskevas
Author_Institution :
Dept. of Comput. Sci. & Eng., Southern Methodist Univ., Dallas, TX, USA
Abstract :
Introduces the mechanisms required to perform fault detection and recovery in the DART (Data-driven Architecture for Real-Time) multiprocessor architecture. The DART multiprocessor uses prioritized data-driven scheduling to ensure that multiple hard and soft deadlines are met. A data-driven checkpointing scheme has been developed that ensures that these deadlines are met even in the case of processor failures. The basic approach is to monitor the behavior of each computational thread by means of hardware timers. The results of a thread are released only if the thread completes before its given timeout period expires. Otherwise, the partial computation on that processor is discarded and the thread is rescheduled on a different processor. A strategy to statically predict the system performance in the event of multiple processor failures is presented and evaluated. Simulation results are provided to illustrate the fault detection and recovery response times for single processor failures on DART multiprocessor architectures with 2, 4, 8, 16, and 32 processing elements.
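The timeout-and-reschedule approach described in the abstract can be illustrated with a minimal software sketch. This is not the DART hardware mechanism or the authors' simulator: the names `Thread` and `run_with_recovery`, the abstract time units, and the assumption that a failed processor simply never completes are all hypothetical, chosen only to show the control flow (release results on timely completion, otherwise discard and retry elsewhere).

```python
from dataclasses import dataclass


@dataclass
class Thread:
    """Hypothetical model of a computational thread (not a DART data structure)."""
    name: str
    work: int     # time units needed on a healthy processor
    timeout: int  # timeout period enforced by the (simulated) timer


def run_with_recovery(thread, processors, failed):
    """Try each processor in order until the thread completes within its timeout.

    Returns (processor_id, result, elapsed_time). processor_id and result are
    None if no processor finished the thread in time.
    Assumption: a failed processor never completes, so its timer always expires.
    """
    elapsed = 0
    for proc in processors:
        completes_in_time = proc not in failed and thread.work <= thread.timeout
        if completes_in_time:
            # Thread finished before the timer expired: release its results.
            elapsed += thread.work
            return proc, f"{thread.name}-result", elapsed
        # Timer expired: discard the partial computation on this processor
        # and reschedule the thread on the next one.
        elapsed += thread.timeout
    return None, None, elapsed


if __name__ == "__main__":
    t = Thread(name="t0", work=3, timeout=5)
    # Processor 0 has failed, so the thread is detected as late and rerun on 1.
    print(run_with_recovery(t, processors=[0, 1, 2], failed={0}))
```

In this sketch the recovery response time is the timeout spent on each failed processor plus the work time on the first healthy one, which is why the choice of timeout period directly affects whether a deadline can still be met after a failure.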
Keywords :
failure analysis; parallel architectures; performance evaluation; real-time systems; scheduling; system recovery; DART multiprocessor architecture; computational thread behaviour monitoring; data-driven checkpointing scheme; data-driven real-time multiprocessor; deadlines; fault detection; fault recovery; hardware timers; multiple processor failures; partial computation; prioritized data-driven scheduling; response times; simulation; static prediction; system performance; thread rescheduling; timeout period; Checkpointing; Computational modeling; Computer architecture; Condition monitoring; Delay; Fault detection; Hardware; Processor scheduling; System performance; Yarn;
Conference_Title :
Parallel Processing Symposium, 1994. Proceedings., Eighth International
Conference_Location :
Cancun
Print_ISBN :
0-8186-5602-6
DOI :
10.1109/IPPS.1994.288217