DocumentCode
1589687
Title
Optimized Rollback and Re-computation
Author
Lakhani, Hasnain ; Tahir, Rashid ; Aqil, Azeem ; Zaffar, Fareed ; Tariq, Dawood ; Gehani, Ashish
fYear
2013
Firstpage
4930
Lastpage
4937
Abstract
Large data processing tasks can be effected using workflow management systems. When either the input data or the programs in the pipeline are modified, the workflow must be re-executed to ensure that the final output data is updated to reflect the changes. Since such re-computation can consume substantial resources, optimizing the system to avoid redundant computation is desirable. In the case of a workflow, the dependency relationships between files are specified at the outset and can be leveraged to track which programs need to be re-executed when particular files change. Current distributed systems cannot provide such functionality when no predefined workflows exist. In this paper, we present an architecture that provides both correct output and fast re-execution by leveraging the provenance of data to propagate changes along an implicit dependency graph. We explore the tradeoff between storage and availability by presenting a performance analysis of our rollback and re-execution scheme.
Keywords
Computational modeling; Context; Data models; History; Performance analysis; Pipelines; Process control; data provenance; error recovery; scientific workflows
fLanguage
English
Publisher
IEEE
Conference_Titel
System Sciences (HICSS), 2013 46th Hawaii International Conference on
Conference_Location
Wailea, HI, USA
ISSN
1530-1605
Print_ISBN
978-1-4673-5933-7
Electronic_ISBN
1530-1605
Type
conf
DOI
10.1109/HICSS.2013.434
Filename
6480439