DocumentCode
1688916
Title
The virtual data grid: a new model and architecture for data-intensive collaboration
Author
Foster, Ian
Author_Institution
Argonne Nat. Lab., Chicago Univ., IL, USA
fYear
2003
Firstpage
11
Abstract
It is increasingly common to encounter communities engaged in the collaborative analysis and transformation of large quantities of data over extended periods of time. I argue that these communities require a scalable system for managing, tracing, exploring, and communicating the derivation and analysis of diverse data objects. Such a system could bring significant productivity increases, facilitating discovery, understanding, assessment, and sharing of both data and transformation resources for computation, storage, and collaboration. I define a model and architecture for a virtual data grid capable of addressing these requirements. I define a broadly applicable model of a "typed dataset" as the unit of derivation tracking, and simple constructs for describing how datasets are derived from transformations and from other datasets. I also define mechanisms for integrating with, and adapting to, existing data management systems and transformation and analysis tools, as well as grid mechanisms for distributed resource management and computation planning. Finally, I report on successful application results obtained with a prototype implementation called Chimera, involving challenging analysis of high-energy physics and astronomy data.
Keywords
astronomy computing; data analysis; data models; distributed databases; distributed processing; physics computing; statistical databases; Chimera; THSOM; analysis tool; astronomy; collaborative analysis; data analysis; data collaboration; data communication; data computation; data handling; data management; data model; data processing; data storage; data transformation; dataset; distributed database; distributed processing; distributed system; grid mechanism; natural science computing; physics; Astronomy; Collaboration; Computer architecture; Data analysis; Distributed computing; Grid computing; Physics; Productivity; Prototypes; Resource management;
fLanguage
English
Publisher
IEEE
Conference_Titel
Scientific and Statistical Database Management, 2003. 15th International Conference on
ISSN
1099-3371
Print_ISBN
0-7695-1964-4
Type
conf
DOI
10.1109/SSDM.2003.1214945
Filename
1214945