DocumentCode :
2934346
Title :
In Situ Data Provenance Capture in Spreadsheets
Author :
Asuncion, Hazeline U.
Author_Institution :
Comput. & Software Syst., Univ. of Washington, Bothell, WA, USA
fYear :
2011
fDate :
5-8 Dec. 2011
Firstpage :
240
Lastpage :
247
Abstract :
The capture of data provenance is a fundamentally important task in eScience. While provenance can be captured using techniques such as scientific workflows, typically these techniques do not trace internal data manipulations that occur within off-the-shelf analysis tools. Yet it is still essential to capture data provenance within such environments. This paper discusses an in situ provenance approach for spreadsheet data in MS Excel, a commonly used analysis environment among scientists. We describe the design and implementation of an Excel tool that captures provenance unobtrusively in the background, allows for user annotations, provides undo/redo functionality at various levels of task granularity, and presents the captured provenance in an accessible format to support a range of provenance queries for analysis. We also present several motivating use case scenarios and a user evaluation which suggests that our approach is both efficient and useful to scientists.
Keywords :
natural sciences computing; spreadsheet programs; MS Excel; eScience; in situ data provenance; off-the-shelf analysis tools; scientific workflows; spreadsheets; user annotations; Context; Data analysis; Data mining; Filtering; Noise; Semantics; data provenance; in situ capture
fLanguage :
English
Publisher :
IEEE
Conference_Title :
E-Science (e-Science), 2011 IEEE 7th International Conference on
Conference_Location :
Stockholm
Print_ISBN :
978-1-4577-2163-2
Type :
conf
DOI :
10.1109/eScience.2011.41
Filename :
6123284