DocumentCode :
3675983
Title :
Managing Complexity in Distributed Data Life Cycles Enhancing Scientific Discovery
Author :
Richard Grunzke;Alvaro Aguilera;Wolfgang E. Nagel;Krüger;Sonja Herres-Pawlis;Alexander Hoffmann;Sandra Gesing
Author_Institution :
Center for Inf. Services &
fYear :
2015
Firstpage :
371
Lastpage :
380
Abstract :
Distributed data life cycles consist of data sources, data and computing components as well as data sinks and user-facing elements. The complexity of the underlying systems is ever rising with the increasing heterogeneity and distribution of components and environments. Researchers would like to focus on their specific research topic without the need to learn these systems in detail. Accessible data life cycles enable scientists to do better science more efficiently and obtain results that would not have been possible without these advanced technologies. For this objective, abstraction to hide complexity and automation to avoid manual tasks are a necessity. These are embodied in the three conceptual data life cycle challenges, namely data, computing and utilization. Concepts and technologies to manage these challenges are explored and exemplified on the basis of the general data life cycle and the MoSGrid (Molecular Simulation Grid) science gateway. In this context, we especially focus on teaching in drug design and quantum chemistry research use cases. Further cases are presented elucidating various challenges in adapting the concepts and technologies to wind energy data analysis and the XSEDE research infrastructure.
Keywords :
"Metadata","Complexity theory","Distributed databases","Chemistry","Drugs","Middleware","File systems"
Publisher :
IEEE
Conference_Titel :
e-Science (e-Science), 2015 IEEE 11th International Conference on
Type :
conf
DOI :
10.1109/eScience.2015.72
Filename :
7304320