Title :
A Transient-Resilient System-on-a-Chip Architecture with Support for On-Chip and Off-Chip TMR
Author :
Obermaisser, R. ; Kraut, H. ; Salloum, C.
Author_Institution :
Vienna Univ. of Technol., Vienna
Abstract :
The ongoing technological advances in the semiconductor industry make Multi-Processor Systems-on-a-Chip (MPSoCs) more attractive, because uniprocessor solutions do not scale satisfactorily with increasing transistor counts. In conjunction with the increasing rates of transient faults in logic and memory associated with the continuous reduction of feature sizes, this situation creates the need for novel MPSoC architectures. This paper introduces such an architecture, which supports the integration of multiple, heterogeneous IP cores that are interconnected by a time-triggered Network-on-a-Chip (NoC). Through its inherent fault isolation and determinism, the proposed MPSoC provides the basis for fault tolerance using Triple Modular Redundancy (TMR). On-chip TMR improves the reliability of an MPSoC, e.g., by tolerating a transient fault in one of three replicated IP cores. Off-chip TMR with three MPSoCs can be used in the development of ultra-dependable applications (e.g., X-by-wire), where the reliability requirements exceed the reliability that is achievable using a single MPSoC. The paper quantifies the reliability benefits of the proposed MPSoC architecture by means of reliability modeling. These results demonstrate that the combination of on-chip and off-chip TMR contributes towards building more dependable distributed embedded real-time systems.
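As a rough illustration of the TMR principle mentioned in the abstract, the sketch below shows a majority voter over three replica outputs together with the textbook reliability formula for a two-out-of-three system. It assumes a perfect voter and statistically independent replica failures; it is not the reliability model used in the paper, and the function names `tmr_vote` and `tmr_reliability` are hypothetical.

```python
from collections import Counter


def tmr_vote(a, b, c):
    """Return the value produced by at least two of three replicas.

    Returns None if all three outputs disagree (no majority exists).
    """
    value, count = Counter([a, b, c]).most_common(1)[0]
    return value if count >= 2 else None


def tmr_reliability(r):
    """Textbook reliability of a TMR triple (assumption: perfect voter,
    independent replicas, each with reliability r).

    The system works if all three replicas work (r^3) or exactly two
    work (3 * r^2 * (1 - r)), which simplifies to 3r^2 - 2r^3.
    """
    return 3 * r**2 - 2 * r**3


if __name__ == "__main__":
    # One replica is hit by a transient fault; the other two mask it.
    print(tmr_vote(42, 42, 7))
    # Reliability of the triple for a single-replica reliability of 0.9.
    print(tmr_reliability(0.9))
```

Under these assumptions, TMR only improves on a single replica when the replica reliability exceeds 0.5, which is one reason fault isolation between replicas matters.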
Keywords :
fault tolerance; integrated circuit design; integrated circuit reliability; integrated circuit testing; logic design; logic testing; network-on-chip; MPSoC architectures; NoC; fault isolation; fault tolerance; heterogeneous IP cores; multiprocessor system-on-a-chips; off-chip TMR; on-chip TMR; semiconductor industry; time-triggered network-on-a-chip; transient-resilient system-on-a-chip architecture; triple modular redundancy; Circuit faults; Computer architecture; Fault tolerance; Logic circuits; Network-on-a-chip; Real time systems; Redundancy; System-on-a-chip; Voting; Yarn; TMR; dependability; real-time systems; system architectures;
Conference_Title :
Dependable Computing Conference, 2008. EDCC 2008. Seventh European
Conference_Location :
Kaunas
Print_ISBN :
978-0-7695-3138-0
DOI :
10.1109/EDCC-7.2008.20