DocumentCode :
1605267
Title :
Self-Recovering Parallel Applications in Multi-core Systems
Author :
Bizot, Gilles ; Avresky, Dimiter ; Chaix, Fabien ; Zergainoh, Nacer-Eddine ; Nicolaidis, Michael
Author_Institution :
TIMA Lab., INP, Grenoble, France
fYear :
2011
Firstpage :
51
Lastpage :
58
Abstract :
In this paper, a Self-Recovering strategy that can dynamically "re-map" application tasks on a multi-core system is presented. Based on run-time failure-aware techniques, this Self-Recovering strategy guarantees seamless termination and delivery of the expected results despite multiple node and link failures in a 2D mesh topology. It has been demonstrated, based on a statistical analysis, that the proposed technique is able to re-map the tasks of faulty nodes in a bounded number of steps. The theoretical results have been validated by simulations. The proposed technique can bypass multiple node, router, and link failures with a predictable number of hops. It has been demonstrated that the Motion JPEG-2000 application can be parallelized and formally represented as a Directed Acyclic Graph (DAG). The proposed technique has been validated by simulating a 1000-core system in the presence of node and link failures of up to 10%. The proposed technique has therefore been shown to be efficient for seamless execution of parallel streaming applications and to provide an Execution Time Reduction Ratio close to ideal.
Keywords :
multiprocessing systems; parallel processing; statistical analysis; 2D mesh topology; Motion JPEG-2000; directed acyclic graph; execution time reduction ratio; link failures; multicore systems; run-time failure aware techniques; self-recovering parallel applications; Fault tolerance; Fault tolerant systems; Heuristic algorithms; Peer to peer computing; Routing; Search problems; Adaptive Fault-Tolerant Routing; Multi-Core Chip; Parallel Streaming Application; Seamless Execution; Self-Recovering
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Network Computing and Applications (NCA), 2011 10th IEEE International Symposium on
Conference_Location :
Cambridge, MA
Print_ISBN :
978-1-4577-1052-0
Electronic_ISBN :
978-0-7695-4489-2
Type :
conf
DOI :
10.1109/NCA.2011.14
Filename :
6038584