Title :
Transient and Permanent Error Co-management Method for Reliable Networks-on-Chip
Author :
Yu, Qiaoyan ; Ampadu, Paul
Author_Institution :
Dept. of Electr. & Comput. Eng., Univ. of Rochester, Rochester, NY, USA
Abstract :
We propose a transient and permanent error co-management method for NoC links to achieve low latency, high throughput, and high reliability while maintaining energy efficiency. To reduce the energy overhead, a configurable error control coding scheme adapts the number of redundant wires to the varying noise conditions, providing different levels of error detection capability. Infrequently used redundant wires serve as spare wires to replace broken links. Furthermore, a packet rebuilding/restoring algorithm that cooperates with a shortened error control coding method is proposed to support low-latency splitting transmission. With this co-management method, we manage transient errors and a small number of permanent errors without using extra spare wires, reducing the need for adaptive routing. Simulation results show that the proposed method achieves up to 71% packet latency reduction and 20% throughput improvement compared to previous methods. Case studies show that our method reduces the energy per packet by up to 68% and 48% for low and high permanent error conditions, respectively.
Keywords :
error statistics; integrated circuit noise; integrated circuit reliability; network-on-chip; NoC links; adaptive routing; broken links; configurable error control coding; energy efficiency; error comanagement method; error detection capability; low-latency splitting transmission; packet rebuilding algorithm; packet restoring algorithm; permanent errors; redundant wires; reliability; reliable networks-on-chip; shortened error control coding method; spare wires; transient errors; Automatic repeat request; Computer errors; Computer network reliability; Crosstalk; Delay; Error correction; Network-on-a-chip; Routing; Throughput; Wires; Network-on-chip; permanent error; reliability; spare wire; splitting transmission; transient error
Conference_Title :
Networks-on-Chip (NOCS), 2010 Fourth ACM/IEEE International Symposium on
Conference_Location :
Grenoble
Print_ISBN :
978-1-4244-7085-3
Electronic_ISBN :
978-1-4244-7086-0
DOI :
10.1109/NOCS.2010.24