• DocumentCode
    1497579
  • Title

    Fault Tolerant Network on Chip Switching With Graceful Performance Degradation

  • Author

    Kohler, Adán ; Schley, Gert ; Radetzki, Martin

  • Author_Institution
    Univ. of Stuttgart, Stuttgart, Germany
  • Volume
    29
  • Issue
    6
  • fYear
    2010
  • fDate
    6/1/2010 12:00:00 AM
  • Firstpage
    883
  • Lastpage
    896
  • Abstract
    The structural redundancy inherent to on-chip interconnection networks [networks on chip (NoC)] can be exploited by adaptive routing algorithms in order to provide connectivity even if network components are out of service due to faults, which will appear at an increasing rate with future chip technology nodes. This paper is based on a new, fine-grained functional fault model and a corresponding distributed fault diagnosis method that facilitate determining the fault status of individual NoC switches and their adjacent communication links. Whereas previous work on network fault tolerance assumes switches to be either available or fully out of service, we present a novel adaptive routing algorithm that employs the remaining functionality of partly defective switches. Using diagnostic information, transient faults are handled with a retransmission scheme that avoids the latency penalty of end-to-end repeat requests. Thereby, graceful degradation of NoC communication performance can be achieved even under high failure rates.
  • Keywords
    circuit reliability; circuit switching; fault diagnosis; fault tolerance; integrated circuit interconnections; network routing; network-on-chip; NoC; adaptive routing algorithm; distributed fault diagnosis; end-to-end repeat request; fault tolerance; fine-grained functional fault model; future chip technology; latency penalty; network on chip switching; on-chip interconnection networks; retransmission scheme; transient faults; Communication switching; Degradation; Delay; Fault diagnosis; Fault tolerance; Multiprocessor interconnection networks; Network-on-a-chip; Redundancy; Routing; Switches; Adaptive routing; graceful degradation; network fault tolerance; network on chip; online fault diagnosis;
  • fLanguage
    English
  • Journal_Title
    Computer-Aided Design of Integrated Circuits and Systems, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    0278-0070
  • Type

    jour

  • DOI
    10.1109/TCAD.2010.2048399
  • Filename
    5467330