Title :
Distributed online software monitoring of manycore architectures
Author :
Faure, Etienne ; Benabdenbi, Mounir ; Pêcheux, Francois
Author_Institution :
LIP6-SoC Lab., Univ. Pierre & Marie Curie, Paris, France
Abstract :
This paper describes the design principles of a software-based on-line testing application used to monitor manycore architectures running multithreaded functional applications. The key idea is to run a non-intrusive monitoring application in parallel with the functional one. The monitoring application aims at detecting and reacting to software or hardware malfunctions, and can be seen as a service provided by the operating system. The monitoring method relies on embedded sensors that capture physical values (temperature, ...) from the chip or software-related indicators such as CPU load. A case study implementing this methodology has been performed, and results in terms of memory usage and performance overhead are given.
Keywords :
computer architecture; computer debugging; multi-threading; multiprocessing systems; program testing; distributed online software monitoring; hardware malfunction; manycore architecture; manycore architecture monitoring; multithread function; nonintrusive monitoring application; software based online testing; software malfunction; Hardware; Monitoring; Program processors; Radiation detectors; Temperature sensors; Testing;
Conference_Title :
On-Line Testing Symposium (IOLTS), 2010 IEEE 16th International
Conference_Location :
Corfu
Print_ISBN :
978-1-4244-7724-1
DOI :
10.1109/IOLTS.2010.5560232