Title :
High-performance data management for genome sequencing centers using Globus Online: A case study
Author :
Sulakhe, D. ; Kettimuthu, Rajkumar ; Dave, Utsav
Author_Institution :
Argonne Nat. Lab., Univ. of Chicago, Chicago, IL, USA
Abstract :
In the past few years in the biomedical field, the availability of low-cost sequencing methods in the form of next-generation sequencing has revolutionized the approaches life science researchers are undertaking in order to gain a better understanding of the causative factors of diseases. With biomedical researchers getting many of their patients' DNA and RNA sequenced, sequencing centers are working with hundreds of researchers with terabytes to petabytes of data for each researcher. The unprecedented scale at which genomic sequence data is generated today by high-throughput technologies requires sophisticated and high-performance methods of data handling and management. For the most part, however, the state of the art is to use hard disks to ship the data. As data volumes reach tens or even hundreds of terabytes, such approaches become increasingly impractical. Data stored on portable media can be easily lost, and typically is not readily accessible to all members of the collaboration. In this paper, we discuss the application of Globus Online within a sequencing facility to address the data movement and management challenges that arise as a result of an exponentially increasing amount of data being generated by a rapidly growing number of research groups. We also present the unique challenges in applying a Globus Online solution in sequencing center environments and how we overcome those challenges.
Keywords :
biology computing; data handling; DNA sequence; RNA sequence; biomedical field; biomedical researchers; case study; data management; data movement; data volumes; genome sequencing centers; globus online; globus online solution; high-performance data management; next-generation sequencing; Access control; Authentication; Educational institutions; File systems; Hard disks; Servers; Globus; Globus Online; GridFTP; cloud; data management; data transfer; grid; next-gen sequencing; sequencing center; translational medicine;
Conference_Title :
E-Science (e-Science), 2012 IEEE 8th International Conference on
Conference_Location :
Chicago, IL
Print_ISBN :
978-1-4673-4467-8
DOI :
10.1109/eScience.2012.6404443