Author_Institution :
Shandong Key Lab. of Energy Genetics, Qingdao Inst. of Bioenergy & Bioprocess Technol., Qingdao, China
Abstract :
With the development of next-generation sequencing and metagenomic technologies, the number of metagenomic samples of microbial communities is growing exponentially. Comparing metagenomic samples can facilitate mining of the valuable yet hidden biological information held in massive metagenomic data. However, current methods for metagenomic comparison are limited in their ability to process very large numbers of samples, each with a large data size. In this work, we have developed an optimized GPU-based metagenomic comparison algorithm, GPU-Meta-Storms, to evaluate the quantitative phylogenetic similarity among massive metagenomic samples, and implemented it using CUDA (Compute Unified Device Architecture) and C++. The GPU-Meta-Storms program is optimized for CUDA with techniques including non-recursive transform, register recycling, and memory alignment. Our results show that, with optimization of the phylogenetic comparison algorithm, the memory access strategy, and the parallelization mechanism on a many-core hardware architecture, GPU-Meta-Storms can compute the pair-wise similarity matrix for 1920 metagenomic samples in 4 minutes, a speed-up of more than 1000 times over the CPU version of Meta-Storms on a single-core CPU and more than 100 times over a 16-core CPU. The high performance of GPU-Meta-Storms in comparing massive metagenomic samples could thus enable in-depth data mining of massive metagenomic data and make real-time analysis and monitoring of constantly changing metagenomic samples possible.
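To make the idea of a non-recursive phylogenetic comparison and a pair-wise similarity matrix concrete, the following is a minimal CPU-side C++ sketch, not the GPU-Meta-Storms implementation. Everything in it is an assumption for illustration: the `FlatTree` layout, the function names `similarity` and `pairwise`, and the scoring rule (shared abundance is counted at each node, and the unmatched remainder is passed to the parent scaled by one minus the branch length). The abstract does not specify the actual scoring function or data layout.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical flattened phylogenetic tree: nodes are stored so that every
// child precedes its parent, and the root has parent == -1. This ordering
// lets a single forward loop replace a recursive tree traversal.
struct FlatTree {
    std::vector<int> parent;   // parent index per node, -1 for the root
    std::vector<double> dist;  // branch length to the parent, assumed in [0, 1]
};

// Iterative bottom-up similarity of two abundance vectors (one value per node).
// Assumed rule: the abundance shared at a node adds to the score; what is left
// over moves up to the parent, discounted by (1 - branch length).
// a and b are taken by value because they are used as scratch space.
double similarity(const FlatTree& t, std::vector<double> a, std::vector<double> b) {
    const std::size_t n = t.parent.size();
    assert(t.dist.size() == n && a.size() == n && b.size() == n);
    double score = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        const double common = std::min(a[i], b[i]);
        score += common;
        if (t.parent[i] >= 0) {
            const std::size_t p = static_cast<std::size_t>(t.parent[i]);
            assert(p > i);  // children must come before their parent
            const double w = 1.0 - t.dist[i];
            a[p] += (a[i] - common) * w;
            b[p] += (b[i] - common) * w;
        }
    }
    return score;
}

// Symmetric pair-wise similarity matrix. Each (i, j) entry is independent of
// the others, which is the property a many-core implementation can exploit.
std::vector<std::vector<double>> pairwise(const FlatTree& t,
                                          const std::vector<std::vector<double>>& samples) {
    const std::size_t m = samples.size();
    std::vector<std::vector<double>> sim(m, std::vector<double>(m, 0.0));
    for (std::size_t i = 0; i < m; ++i) {
        for (std::size_t j = i; j < m; ++j) {
            sim[i][j] = sim[j][i] = similarity(t, samples[i], samples[j]);
        }
    }
    return sim;
}
```

For 1920 samples the matrix has 1920 × 1919 / 2 = 1,842,240 distinct off-diagonal pairs. Because every pair can be evaluated independently, one plausible GPU mapping assigns each pair to its own thread; how GPU-Meta-Storms actually partitions the work is not stated in the abstract.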
Keywords :
C++ language; biology computing; data mining; evolution (biological); genetics; genomics; graphics processing units; meta data; microorganisms; parallel architectures; 16-core CPU; C++ programming; CPU version Meta-Storms; CUDA; GPU-Meta-Storms program; compute unified device architecture; data size; exponential speed; hidden biological information; in-depth data mining; many-core hardware architecture; massive microbial communities; memory accessing strategy; memory alignment; metagenomic sample monitoring; metagenomic technologies; next-generation sequencing; nonrecursive transform; optimized GPU-based metagenomic comparison algorithm; pair-wise similarity matrix; parallelization mechanism; phylogenetic comparison algorithm; quantitative phylogenetic similarity; real-time analysis; register recycle; single-core CPU; time 4 min; Communities; Graphics processing units; Monitoring; Next generation networking; Optimization; Phylogeny; Programming; GPU; High Performance Computing; Metagenome; Phylogenetic Comparison;