DocumentCode :
668079
Title :
Tackling Permanent Faults in the Network-on-Chip Router Pipeline
Author :
Poluri, Pavan ; Louri, Ahmed
Author_Institution :
Dept. of Electr. & Comput. Eng., Univ. of Arizona, Tucson, AZ, USA
fYear :
2013
fDate :
23-26 Oct. 2013
Firstpage :
49
Lastpage :
56
Abstract :
The proliferation of multi-core and many-core chips for performance scaling is making the Network-on-Chip (NoC) occupy a growing amount of silicon area spanning several metal layers. The NoC is not immune to hard or transient faults, and technology scaling is increasing the incidence of hard faults. The ramifications are immense: a single fault in the NoC may paralyze the entire chip. To this end, we propose a Permanent Fault Tolerant Router (PFTR) that is capable of tolerating multiple permanent faults in the pipeline. PFTR is designed by making architectural modifications to individual pipeline stages of the baseline NoC router. These modifications involve adding minimal extra circuitry and exploiting temporal parallelism to accomplish fault tolerance. Tolerance of multiple faults is achieved by striking a balance between three important design factors, namely area overhead, power overhead, and reliability. We use Silicon Protection Factor (SPF) as the metric to assess the reliability improvement of the proposed architecture. SPF takes into account both the number of faults required to cause failure and the area overhead of the additional circuitry. SPF calculation reveals that the proposed PFTR is 11 times more reliable than the baseline NoC router. Synthesis results using Cadence Encounter RTL Compiler at 45 nm technology show that the additional circuitry adds an area overhead of 31% and a power overhead of 30% with respect to the baseline NoC router. PFTR provides much better reliability with much less overhead than other fault-tolerant routers such as BulletProof, Vicis, and RoCo [15].
Keywords :
fault tolerance; multiprocessing systems; network-on-chip; telecommunication network reliability; telecommunication network routing; BulletProof; PFTR; RoCo; SPF calculation; Vicis; architectural modifications; baseline NoC router; Cadence Encounter RTL Compiler; circuitry; fault tolerance; fault tolerant routers; many-core chips; multicore chips; network on chip router pipeline; permanent fault tolerant router; permanent faults; power overhead; reliability metric; silicon protection factor; technology scaling; transient faults; Circuit faults; Fault tolerance; Fault tolerant systems; Multiplexing; Pipelines; Ports (Computers); Switches; Area; Latency; Network-on-Chip; Power; Reliability; Router Architecture
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Computer Architecture and High Performance Computing (SBAC-PAD), 2013 25th International Symposium on
Conference_Location :
Porto de Galinhas
Print_ISBN :
978-1-4799-2927-6
Type :
conf
DOI :
10.1109/SBAC-PAD.2013.32
Filename :
6702579