• DocumentCode
    1961271
  • Title
    Fault Localizing End-to-End Flow Control Protocol for Networks-on-Chip
  • Author
    Schley, Gert ; Batzolis, N. ; Radetzki, Martin
  • Author_Institution
    Embedded Syst. Eng. Group (ES), Univ. of Stuttgart, Stuttgart, Germany
  • fYear
    2013
  • fDate
    Feb. 27 2013-March 1 2013
  • Firstpage
    454
  • Lastpage
    461
  • Abstract
    Reliable data exchange between the cores of a Network-on-Chip (NoC) is of great importance for correct system behavior. However, data exchange is hampered by transient and permanent faults in the NoC's communication structure (links). These faults may cause corruption or loss of data, which in turn may lead to performance degradation or, in the worst case, to complete system failure. When data is corrupted by a transient fault, a common measure is to retransmit the data. To ensure that faulty data is retransmitted, so-called flow control protocols are applied. In the case of permanent faults, simple retransmission is not sufficient. Permanent faults, e.g. in links, corrupt data permanently as long as they are not located, so even retransmissions get corrupted. In this paper we present a fault-tolerant end-to-end protocol applicable to arbitrary NoC topologies. It ensures reliable end-to-end communication in the presence of transient and permanent faults in the interconnection structure. By means of the protocol's online diagnostic ability, it is capable of locating faulty links and switches without any additional diagnosis hardware.
  • Keywords
    electronic data interchange; failure analysis; fault diagnosis; fault tolerant computing; multiprocessor interconnection networks; network topology; network-on-chip; performance evaluation; protocols; NoC communication structure; arbitrary NoC topologies; diagnosis hardware; fault localizing end-to-end flow control protocol; fault tolerant end-to-end protocol; faulty data retransmission; interconnection structure; networks-on-chip; performance degradation; permanent data corruption; permanent faults; protocol online diagnostic ability; reliable data exchange; reliable end-to-end communication; system failure; transient faults; Buffer storage; Protocols; Receivers; Reliability; Routing; Software; Transient analysis; Fault Tolerance; Networks-on-Chip; Protocol;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Parallel, Distributed and Network-Based Processing (PDP), 2013 21st Euromicro International Conference on
  • Conference_Location
    Belfast
  • ISSN
    1066-6192
  • Print_ISBN
    978-1-4673-5321-2
  • Electronic_ISBN
    1066-6192
  • Type
    conf
  • DOI
    10.1109/PDP.2013.74
  • Filename
    6498590