Title :
Fault Localization in NoCs Exploiting Periodic Heartbeat Messages in a Many-Core Environment
Author :
Garbade, A. ; Weis, Sebastian ; Schlingmann, S. ; Fechner, B. ; Ungerer, Theo
Author_Institution :
Univ. of Augsburg, Augsburg, Germany
Abstract :
This paper presents a novel fault localization approach for NoCs that leverages so-called timed heartbeat messages. These messages are sent periodically to report the health states of processor cores to a fault detection unit, and their timing behavior also carries information about the health state of the network (topology). We show how this health state information can be extracted from the message arrival times and estimate the expected cost of the technique.
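The abstract does not spell out the localization procedure, so the following is only an illustrative sketch of how arrival times might narrow down a faulty link. Everything in it is an assumption and not the authors' method: a 2D mesh with dimension-ordered (XY) routing, a single fault detection unit at a known node, a fixed lateness tolerance, and the hypothetical helper names `xy_route` and `suspect_links`.

```python
from typing import Dict, List, Optional, Set, Tuple

Node = Tuple[int, int]   # (x, y) mesh coordinate (assumed topology)
Link = Tuple[Node, Node]  # directed link between neighbouring nodes


def xy_route(src: Node, dst: Node) -> List[Link]:
    """Links traversed by X-then-Y routing in a 2D mesh (assumed routing)."""
    links: List[Link] = []
    x, y = src
    while x != dst[0]:
        nx = x + (1 if dst[0] > x else -1)
        links.append(((x, y), (nx, y)))
        x = nx
    while y != dst[1]:
        ny = y + (1 if dst[1] > y else -1)
        links.append(((x, y), (x, ny)))
        y = ny
    return links


def suspect_links(
    arrivals: Dict[Node, Optional[float]],
    expected: Dict[Node, float],
    monitor: Node,
    tolerance: float,
) -> Set[Link]:
    """Links used only by late or missing heartbeats, never by on-time ones."""
    late: Set[Link] = set()
    healthy: Set[Link] = set()
    for core, t_expected in expected.items():
        path = set(xy_route(core, monitor))
        t = arrivals.get(core)
        if t is None or t - t_expected > tolerance:
            late |= path      # heartbeat missing or delayed
        else:
            healthy |= path   # every link on this path worked in time
    return late - healthy


# Hypothetical example: detection unit at (0, 0), three reporting cores.
expected = {(1, 0): 10.0, (2, 0): 10.0, (0, 1): 10.0}
arrivals = {(1, 0): 10.1, (2, 0): 14.0, (0, 1): 10.0}
suspects = suspect_links(arrivals, expected, monitor=(0, 0), tolerance=0.5)
```

In this toy case only the heartbeat from core (2, 0) is late, and the link it shares with the on-time heartbeat from (1, 0) is cleared, which leaves the link from (2, 0) to (1, 0) as the only suspect.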
Keywords :
fault diagnosis; multiprocessing systems; network-on-chip; NoCs; fault detection unit; fault localization approach; health state information; many-core environment; network health state; periodic heartbeat messages; processor cores; timed heartbeat messages; timing behavior; Fault detection; Heart beat; Monitoring; Network topology; Ports (Computers); Routing; Topology; Fault Localisation; Heartbeat Messages; Network-On-Chip
Conference_Title :
Parallel and Distributed Processing Symposium Workshops & PhD Forum (IPDPSW), 2013 IEEE 27th International
Conference_Location :
Cambridge, MA
Print_ISBN :
978-0-7695-4979-8
DOI :
10.1109/IPDPSW.2013.150