• DocumentCode
    121201
  • Title

    The DeSyRe Runtime Support for Fault-Tolerant Embedded MPSoCs

  • Author

    Pnevmatikatos, Dionisios ; Tzilis, Stavros ; Sourdis, Ioannis

  • Author_Institution
    Comput. Archit. & VLSI Syst. Lab., Inst. of Comput. Sci. Found. for Res. & Technol. - Hellas, Heraklion, Greece
  • fYear
    2014
  • fDate
    26-28 Aug. 2014
  • Firstpage
    197
  • Lastpage
    204
  • Abstract
    Semiconductor technology scaling makes chips more sensitive to faults. This paper describes the DeSyRe design approach and its runtime management for future reliable embedded Multiprocessor Systems-on-Chip (MPSoCs). A lightweight runtime system for shared-memory MPSoCs is described that supports fault-tolerant execution upon detection of transient and permanent faults. The DeSyRe runtime system offers re-execution of tasks that suffer from transient faults and task migration in cases where a worker processor is permanently faulty. In addition, a faulty worker can potentially remain usable, increasing the system's fault tolerance. This is achieved by using alternative task implementations, which avoid the faulty circuit and are indicated in the application code via pragma annotations, as well as by repairing a faulty core via hardware reconfiguration. The system can thereby be dynamically adapted using one or more of the above mechanisms to mitigate faults. The DeSyRe runtime system is evaluated using micro-benchmarks running on a Virtex-6 FPGA MPSoC. Results suggest that our enhanced fault-tolerant runtime system can successfully and efficiently execute all application tasks under a variety of fault cases.
  • Keywords
    embedded systems; fault tolerant computing; field programmable gate arrays; multiprocessing systems; system-on-chip; DeSyRe design approach; DeSyRe runtime support; Virtex-6 FPGA MPSoC; fault-tolerant embedded MPSoC; field programmable gate array; multiprocessing system-on-chip; permanent fault detection; semiconductor technology scaling; shared-memory MPSoC; transient fault detection; Circuit faults; Fault tolerance; Fault tolerant systems; Hardware; Runtime; System-on-chip; Transient analysis; FPGAs; embedded MPSoCs; runtime support;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Parallel and Distributed Processing with Applications (ISPA), 2014 IEEE International Symposium on
  • Conference_Location
    Milan
  • Type

    conf

  • DOI
    10.1109/ISPA.2014.34
  • Filename
    6924447
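
The recovery policy summarized in the abstract (re-execute on a transient fault, switch to an alternative task implementation that avoids a faulty circuit, migrate the task when a worker cannot run it at all) can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the DeSyRe runtime itself: the names `Fault`, `Worker`, `Task`, `usable_impl`, `run_task`, and the `detect_fault` callback are hypothetical, the attempt limit is an arbitrary choice, and repair by hardware reconfiguration is not modeled.

```python
from dataclasses import dataclass, field
from enum import Enum, auto


class Fault(Enum):
    NONE = auto()
    TRANSIENT = auto()
    PERMANENT = auto()


@dataclass
class Worker:
    name: str
    # Hardware units of this worker known to be permanently faulty.
    broken_units: set = field(default_factory=set)


@dataclass
class Task:
    name: str
    # Each implementation is the set of hardware units it needs. The first
    # entry is the default; later entries stand in for the alternative
    # implementations that the abstract says are marked via pragmas.
    implementations: list


def usable_impl(task, worker):
    """Return the first implementation that avoids the worker's broken units."""
    for units in task.implementations:
        if not (units & worker.broken_units):
            return units
    return None


def run_task(task, workers, detect_fault, max_attempts=4):
    """Run `task` and return (worker name, implementation used, action log).

    `detect_fault(worker, units, attempt)` is a hypothetical detector that
    returns a (Fault, faulty_unit) pair; how faults are actually detected
    is not specified here.
    """
    log = []
    for worker in workers:
        for attempt in range(max_attempts):
            units = usable_impl(task, worker)
            if units is None:
                break  # nothing on this worker avoids its faulty circuits
            fault, unit = detect_fault(worker, units, attempt)
            if fault is Fault.NONE:
                log.append(f"completed on {worker.name}")
                return worker.name, units, log
            if fault is Fault.TRANSIENT:
                # Transient fault: simply run the task again.
                log.append(f"re-execute on {worker.name}")
            else:
                # Permanent fault: remember the unit so that the next
                # attempt picks an implementation that avoids it.
                worker.broken_units.add(unit)
                log.append(f"{unit} marked faulty on {worker.name}")
        # Out of options or attempts on this worker: migrate the task.
        log.append(f"migrate away from {worker.name}")
    raise RuntimeError(f"no worker could complete {task.name}")
```

In this sketch a worker with a permanently faulty unit stays in use as long as some implementation avoids that unit, which mirrors the abstract's point that a faulty worker can remain usable; migration is only the fallback.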