DocumentCode :
289998
Title :
Fault-tolerance on regular decomposition grid applications
Author :
Silva, Luis Moura ; Silva, Joao Gabriel ; Chapple, Simon ; Clarke, Lyndon
Author_Institution :
Dept. de Engenharia Informatica, Coimbra Univ., Portugal
fYear :
1995
fDate :
25-27 Jan 1995
Firstpage :
358
Lastpage :
365
Abstract :
Writing parallel applications is considerably more complex than writing sequential ones because of additional problems not found in the sequential environment. The main problems include communication, synchronization, data partitioning and distribution, mapping of processes, heterogeneity and fault tolerance. Fault tolerance is a very important feature in parallel/distributed systems since the mean time between failures of the system decreases with the number of processors, and the failure of just one process(or) can lead to the crash of the entire application. This paper presents an example of a parallel library (PUL-RD) that solves most of the problems listed above and provides support for fault tolerance. The original version of the library offers high-level support for parallelism in a portable way and can be used to write grid-based parallel applications which have a regular decomposition. In this paper, we describe the fault-tolerance features that were incorporated into PUL-RD, giving special attention to the functionality of the checkpointing scheme.
Keywords :
fault tolerant computing; software fault tolerance; synchronisation; PUL-RD; checkpointing; communication; distribution; fault-tolerance; heterogeneity; high-level support; regular decomposition grid applications; synchronization data partitioning; Application software; Computer crashes; Fault tolerance; Libraries; Parallel processing; Parallel programming; Programming profession; Software reusability; Utility programs; Writing
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Parallel and Distributed Processing, 1995. Proceedings. Euromicro Workshop on
Conference_Location :
San Remo
Print_ISBN :
0-8186-7031-2
Type :
conf
DOI :
10.1109/EMPDP.1995.389187
Filename :
389187