DocumentCode :
2721166
Title :
Comprehensive data infrastructure for plant bioinformatics
Author :
Jordan, Chris ; Stanzione, Dan ; Ware, Doreen ; Lu, Jerry ; Noutsos, Christos
Author_Institution :
Texas Adv. Comput. Center, Univ. of Texas at Austin, Austin, TX, USA
fYear :
2010
fDate :
20-24 Sept. 2010
Firstpage :
1
Lastpage :
5
Abstract :
The iPlant Collaborative is a 5-year, National Science Foundation-funded effort to develop cyberinfrastructure to address a series of grand challenges in plant science. The second of these grand challenges is the Genotype-to-Phenotype project, which seeks to provide tools, in the form of a web-based Discovery Environment, for understanding the developmental process from DNA to a full-grown plant. Addressing this challenge requires the integration of multiple data types that may be stored in multiple formats, with varying levels of standardization. Providing for reproducibility requires capturing detailed information that documents the experimental provenance of data and the computational transformations applied to data once it is brought into the iPlant environment. Handling the large quantities of data involved in high-throughput sequencing and other experimental sources of bioinformatics data requires a robust infrastructure for storing and reusing large data objects. We describe the currently planned workflows to be developed for the Genotype-to-Phenotype Discovery Environment, the data types and formats that must be imported and manipulated within the environment, and the data model that has been developed to express and exchange data within the Discovery Environment, along with the provenance model defined for capturing experimental source and digital transformation descriptions. Capabilities for interaction with reference databases are addressed, focusing not just on the ability to retrieve data from such data sources, but also on the ability to use the iPlant Discovery Environment to further populate these important resources. Future activities and the challenges they will present to the data infrastructure of the iPlant Collaborative are also described.
Keywords :
bioinformatics; botany; data structures; comprehensive data infrastructure; digital transformation descriptions; Genotype-to-Phenotype project; iPlant Collaborative; National Science Foundation; plant bioinformatics; reference databases; web-based Discovery Environment; data; gateways; metadata; provenance; standards
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Cluster Computing Workshops and Posters (CLUSTER WORKSHOPS), 2010 IEEE International Conference on
Conference_Location :
Heraklion, Crete
Print_ISBN :
978-1-4244-8395-2
Electronic_ISBN :
978-1-4244-8397-6
Type :
conf
DOI :
10.1109/CLUSTERWKSP.2010.5613093
Filename :
5613093