DocumentCode :
1786872
Title :
SHiFA: System-level hierarchy in run-time fault-aware management of many-core systems
Author :
Fattah, Mohammad ; Palesi, Maurizio ; Liljeberg, Pasi ; Plosila, Juha ; Tenhunen, Hannu
Author_Institution :
Univ. of Turku, Turku, Finland
fYear :
2014
fDate :
1-5 June 2014
Firstpage :
1
Lastpage :
6
Abstract :
A system-level approach to fault-aware resource management of many-core systems is proposed. The proposed approach, called SHiFA, is able to tolerate run-time faults at system level without any hardware overhead. In contrast to existing system-level methods, network resources are also considered to be potentially faulty. Accordingly, applications are mapped onto healthy nodes of the system at run-time such that their interaction does not require the use of faulty elements. Using a simple routing approach, results show 100% utilizability of PEs and a 99.41% rate of successful mapping when up to 8 links are broken. The SHiFA design is based on distributed operating systems, so that it remains scalable for future many-core systems. A significant improvement in scalability properties is observed compared to state-of-the-art distributed approaches.
Keywords :
fault tolerant computing; multiprocessing systems; operating systems (computers); resource allocation; SHiFA design; distributed operating systems; fault-aware resource management; many-core systems; network resources; routing approach; run-time fault-aware management; system-level hierarchy; Circuit faults; Fault tolerance; Fault tolerant systems; Kernel; Mobile communication; Resource management; Routing; application mapping; hierarchical management; system-level design;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Design Automation Conference (DAC), 2014 51st ACM/EDAC/IEEE
Conference_Location :
San Francisco, CA
Type :
conf
DOI :
10.1145/2593069.2593214
Filename :
6881428