Title :
Towards on-chip fault-tolerant communication
Author :
Dumitras, Tudor ; Kerner, S. ; Marculescu, Radu
Author_Institution :
Carnegie Mellon Univ., Pittsburgh, PA, USA
Abstract :
As CMOS technology scales down into the deep-submicron (DSM) domain, devices and interconnects are subject to new types of malfunctions and failures that are harder to predict and avoid with current system-on-chip (SoC) design methodologies. Relaxing the requirement of 100% correctness in operation drastically reduces design costs but, at the same time, requires that SoCs be designed with some degree of system-level fault tolerance. In this paper, we introduce a high-level model of DSM failure patterns and propose a new communication paradigm for SoCs, namely stochastic communication. Specifically, for a generic tile-based architecture, we propose a randomized algorithm that not only separates computation from communication but also provides the required tolerance to on-chip failures. This new technique is easy and cheap to implement in SoCs that integrate a large number of communicating IP cores.
Keywords :
CMOS integrated circuits; VLSI; failure analysis; fault tolerant computing; integrated circuit reliability; stochastic processes; system-on-chip; CMOS technology; DSM failure patterns; NoC architecture; SoC design; VLSI chips; communicating IP cores; deep submicron domain; failure model; generic tile-based architecture; high-level model; network-on-chip architecture; on-chip failures; on-chip fault-tolerant communication; performance metrics; randomized algorithm; stochastic communication; system-level fault tolerance; system-on-chip design; Costs; Design automation; Design methodology; Fault tolerance; Integrated circuit interconnections; Network-on-a-chip; Protocols; Tiles; Very large scale integration
Conference_Title :
Proceedings of the ASP-DAC 2003. Asia and South Pacific Design Automation Conference, 2003
Print_ISBN :
0-7803-7659-5
DOI :
10.1109/ASPDAC.2003.1195021