DocumentCode :
3469622
Title :
Locks and barriers in checkpointing and recovery
Author :
Badrinath, Ramamurthy ; Morin, Christine
Author_Institution :
CSE Dept., Indian Inst. of Technol., Kharagpur, India
fYear :
2004
fDate :
19-22 April 2004
Firstpage :
459
Lastpage :
466
Abstract :
Dependency tracking between communicating tasks is an important concept in backward error recovery for parallel applications. One can extend the traditional dependency tracking model for message passing systems to track dependencies between shared memory and task private states for shared memory applications. The objective of this paper is to analyze the issues generated by locks and barriers in parallel applications so that we can checkpoint tasks at any time (even when holding or waiting for locks and barriers). In particular, we attempt to extend earlier dependency tracking mechanisms to locks and barriers. We address both coordinated and uncoordinated checkpointing schemes.
Keywords :
fault tolerant computing; message passing; parallel programming; shared memory systems; system recovery; workstation clusters; backward error recovery; barriers; communicating tasks; coordinated checkpointing; dependency tracking; locks; message passing systems; parallel applications; private states; shared memory; system recovery; uncoordinated checkpointing; Checkpointing; Context modeling; Fault detection; Fault tolerance; Fault tolerant systems; Grid computing; Hardware; Kernel; Message passing; Protocols;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Cluster Computing and the Grid, 2004. CCGrid 2004. IEEE International Symposium on
Print_ISBN :
0-7803-8430-X
Type :
conf
DOI :
10.1109/CCGrid.2004.1336601
Filename :
1336601