DocumentCode
1961271
Title
Fault Localizing End-to-End Flow Control Protocol for Networks-on-Chip
Author
Schley, Gert ; Batzolis, N. ; Radetzki, Martin
Author_Institution
Embedded Syst. Eng. Group (ES), Univ. of Stuttgart, Stuttgart, Germany
fYear
2013
fDate
Feb. 27 2013-March 1 2013
Firstpage
454
Lastpage
461
Abstract
Reliable data exchange between the cores of a Network-on-Chip (NoC) is of great importance for correct system behavior. However, data exchange is hampered by transient and permanent faults in the NoC's communication structure (links). These faults may cause corruption or loss of data, which in turn may lead to performance degradation or, in the worst case, to complete system failure. When data is corrupted by a transient fault, a common remedy is to retransmit the data. To ensure that faulty data is retransmitted, so-called flow control protocols are applied. In the case of permanent faults, simple retransmission is not sufficient. Permanent faults, e.g. in links, corrupt data permanently as long as they are not located, so even retransmissions get corrupted. In this paper we present a fault-tolerant end-to-end protocol applicable to arbitrary NoC topologies. It ensures reliable end-to-end communication in the presence of transient and permanent faults in the interconnection structure. By means of the protocol's online diagnostic ability, it is capable of locating faulty links and switches without any additional diagnosis hardware.
Keywords
electronic data interchange; failure analysis; fault diagnosis; fault tolerant computing; multiprocessor interconnection networks; network topology; network-on-chip; performance evaluation; protocols; NoC communication structure; arbitrary NoC topologies; diagnosis hardware; fault localizing end-to-end flow control protocol; fault tolerant end-to-end protocol; faulty data retransmission; interconnection structure; networks-on-chip; performance degradation; permanent data corruption; permanent faults; protocol online diagnostic ability; reliable data exchange; reliable end-to-end communication; system failure; transient faults; Buffer storage; Protocols; Receivers; Reliability; Routing; Software; Transient analysis; Fault Tolerance; Networks-on-Chip; Protocol;
fLanguage
English
Publisher
IEEE
Conference_Titel
Parallel, Distributed and Network-Based Processing (PDP), 2013 21st Euromicro International Conference on
Conference_Location
Belfast
ISSN
1066-6192
Print_ISBN
978-1-4673-5321-2
Electronic_ISBN
1066-6192
Type
conf
DOI
10.1109/PDP.2013.74
Filename
6498590