DocumentCode :
659574
Title :
A scalable data analysis platform for metagenomics
Author :
Tang, Wei ; Wilkening, Jared ; Desai, Narayan ; Gerlach, Wolfgang ; Wilke, Andreas ; Meyer, Folker
Author_Institution :
Argonne Nat. Lab., Argonne, IL, USA
fYear :
2013
fDate :
6-9 Oct. 2013
Firstpage :
21
Lastpage :
26
Abstract :
With the advent of high-throughput DNA sequencing technology, the analysis and management of the increasing amount of biological sequence data have become a bottleneck for scientific progress. For example, MG-RAST, a metagenome annotation system serving a large scientific community worldwide, has experienced sustained, exponential growth in data submissions for several years, and this trend is expected to continue. To address the computational challenges posed by this workload, we developed a new data analysis platform, including a data management system (Shock) for biological sequence data and a workflow management system (AWE) supporting scalable, fault-tolerant task and resource management. Shock and AWE can be used to build a scalable and reproducible data analysis infrastructure for upper-level biological data analysis services.
Keywords :
DNA; biology computing; data analysis; genomics; AWE; MG-RAST; Shock; biological sequence data; data analysis platform; data management system; data submissions; high-throughput DNA sequencing technology; metagenome annotation system; metagenomics; scalable data analysis platform; scientific progress; upper-level biological data analysis services; workflow management system; Bioinformatics; Electric shock; Pipelines; Servers; Throughput; cloud computing; workflow
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Big Data, 2013 IEEE International Conference on
Conference_Location :
Silicon Valley, CA
Type :
conf
DOI :
10.1109/BigData.2013.6691723
Filename :
6691723