Title :
Scientific lineage and object-based storage systems
Author :
Todd, Steve ; Hushon, Dan
Author_Institution :
EMC Corp., MA, USA
Abstract :
The lineage of scientific data refers to the linkage of a data set with the inputs and algorithms used to generate it. The input data, the algorithms, and the output data can be represented by nodes in a lineage graph; the child node (the output data) is connected by unidirectional arcs to the parent nodes (the inputs and the algorithm). Lineage graphs provide reproducibility as well as navigation back to original inputs and algorithms. Storage system technologies can help with the storage and management of data lineage information, and recent developments in the storage industry can assist in the creation of lineage graphs. Object-addressable storage (OAS) systems can unify data with its lineage; the eXtensible Access Method (XAM) can serve as an industry-standard access method for manipulating these unified objects. Object-addressable storage systems can be mounted as cloud storage devices. These devices are capable of providing lineage functionality to provenance-aware applications.
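The lineage graph described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation and does not use XAM or any OAS API; the class `LineageNode`, the function `ancestors`, and all node names are hypothetical, chosen only to show an output node pointing back to its parent inputs and algorithm through unidirectional arcs.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class LineageNode:
    """One node in a lineage graph (hypothetical structure for illustration)."""
    name: str
    kind: str  # "input", "algorithm", or "output"
    # Unidirectional arcs: a child (output) points to its parents only.
    parents: List["LineageNode"] = field(default_factory=list)


def ancestors(node: LineageNode) -> List[LineageNode]:
    """Follow parent arcs back to every input and algorithm behind `node`."""
    seen = set()
    order = []
    stack = list(node.parents)
    while stack:
        current = stack.pop()
        if current.name in seen:
            continue
        seen.add(current.name)
        order.append(current)
        stack.extend(current.parents)
    return order


# Hypothetical two-step pipeline: each output has an input and an algorithm as parents.
raw = LineageNode("raw_scan", "input")
denoise = LineageNode("denoise_v1", "algorithm")
clean = LineageNode("clean_scan", "output", parents=[raw, denoise])
segment = LineageNode("segment_v2", "algorithm")
final = LineageNode("segments", "output", parents=[clean, segment])

# Nodes with no parents are the original inputs and algorithms.
roots = sorted(n.name for n in ancestors(final) if not n.parents)
print(roots)
```

Walking the parent arcs from the final output recovers the original inputs and algorithms, which is the navigation property the abstract attributes to lineage graphs.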
Keywords :
data analysis; information management; scientific information systems; storage management; OAS system; cloud storage device; data lineage information management; data lineage information storage; eXtensible Access Method; industry standard access method; lineage functionality; lineage graph; object-addressable storage system; object-based storage system; provenance-aware application; scientific data lineage; Clouds; Couplings; Electromagnetic compatibility; File systems; Image databases; Navigation; Pediatrics; Reproducibility of results; Technology management; Writing;
Conference_Title :
E-Science Workshops, 2009 5th IEEE International Conference on
Conference_Location :
Oxford
Print_ISBN :
978-1-4244-5946-9
DOI :
10.1109/ESCIW.2009.5408005