• DocumentCode
    123049
  • Title

    Runtime fault recovery protocol for NoC-based MPSoCs

  • Author

    Wachter, Eduardo ; Erichsen, Augusto ; Juracy, Leonardo ; Amory, Alexandre ; Moraes, Filipe

  • Author_Institution
    Fac. of Inf., Santa Cruz do Sul Univ. (UNISC), Santa Cruz, Brazil
  • fYear
    2014
  • fDate
    3-5 March 2014
  • Firstpage
    132
  • Lastpage
    139
  • Abstract
    The design of reliable MPSoCs is mandatory to cope with faults during fabrication or product lifetime. For instance, permanent faults on the interconnect network can stall or crash applications even though the network has alternative fault-free paths to a given destination. This paper presents a novel fault-tolerant communication protocol that takes advantage of the NoC parallelism to provide alternative paths between any source-target pair of processors, even in the presence of multiple faults. At the application layer, the method is seen as a typical MPI-like message passing protocol. At the lower layers, the method consists of a software kernel layer that monitors the regularity of message exchanges between pairs of tasks. If a message is not delivered in a certain time, the software fires a path finding mechanism implemented in hardware, which guarantees complete network reachability. The proposed approach determines new paths quickly, and the costs of extra silicon area and memory usage are small.
  • Keywords
    fault tolerance; integrated circuit interconnections; microprocessor chips; network-on-chip; MPI-like message passing protocol; NoC-based MPSoC; application layer; fault-free paths; fault-tolerant communication protocol; interconnect network; memory usage; message exchanges; multiprocessor system-on-chip; network reachability; network-on-chip parallelism; path finding mechanism; permanent faults; product lifetime; runtime fault recovery protocol; software kernel layer; Fault tolerance; Fault tolerant systems; Hardware; Ports (Computers); Program processors; Protocols; Routing
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Quality Electronic Design (ISQED), 2014 15th International Symposium on
  • Conference_Location
    Santa Clara, CA
  • Print_ISBN
    978-1-4799-3945-9
  • Type

    conf

  • DOI
    10.1109/ISQED.2014.6783316
  • Filename
    6783316