Title :
merAligner: A Fully Parallel Sequence Aligner
Author :
Georganas, Evangelos ; Buluc, Aydin ; Chapman, Jarrod ; Oliker, Leonid ; Rokhsar, Daniel ; Yelick, Katherine
Author_Institution :
Comput. Res. Div., Lawrence Berkeley Nat. Lab., Berkeley, CA, USA
Abstract :
Aligning a set of query sequences to a set of target sequences is an important task in bioinformatics. In this work we present merAligner, a highly parallel sequence aligner that implements a seed-and-extend algorithm and employs parallelism in all of its components. MerAligner relies on a high performance distributed hash table (seed index) and uses the one-sided communication capabilities of Unified Parallel C (UPC) to facilitate fine-grained parallelism. We leverage communication optimizations during the construction of the distributed hash table and software caching schemes to reduce communication during the aligning phase. Additionally, merAligner preprocesses the target sequences to extract properties enabling exact sequence matching with minimal communication. Finally, we efficiently parallelize the I/O intensive phases and implement an effective load balancing scheme. Results show that merAligner exhibits efficient scaling up to thousands of cores on a Cray XC30 supercomputer using real human and wheat genome data while significantly outperforming existing parallel alignment tools.
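To make the seed-and-extend idea concrete, here is a minimal single-process sketch in Python. It is not merAligner's implementation: the abstract describes a distributed hash table accessed through UPC one-sided communication, whereas this sketch uses an ordinary in-memory dictionary as the seed index. The function names (build_seed_index, extend, align), the seed length k, and the exact-match-only extension step are all assumptions chosen for illustration.

```python
from collections import defaultdict


def build_seed_index(targets, k):
    """Map every length-k substring (seed) to the (target_id, offset) pairs where it occurs."""
    index = defaultdict(list)
    for tid, seq in enumerate(targets):
        for off in range(len(seq) - k + 1):
            index[seq[off:off + k]].append((tid, off))
    return index


def extend(query, q_off, target, t_off, k):
    """Grow an exact seed hit to the left and right while the bases keep matching.

    Simplification: only ungapped, exact-match extension is done here.
    """
    left = 0
    while (q_off - left > 0 and t_off - left > 0
           and query[q_off - left - 1] == target[t_off - left - 1]):
        left += 1
    right = k
    while (q_off + right < len(query) and t_off + right < len(target)
           and query[q_off + right] == target[t_off + right]):
        right += 1
    # Return (query start, target start, matched length).
    return q_off - left, t_off - left, left + right


def align(query, targets, index, k):
    """Look up every seed of the query in the index and extend each hit."""
    hits = set()  # a set removes duplicate alignments reached from different seeds
    for q_off in range(len(query) - k + 1):
        for tid, t_off in index.get(query[q_off:q_off + k], ()):
            q_start, t_start, length = extend(query, q_off, targets[tid], t_off, k)
            hits.add((tid, q_start, t_start, length))
    return sorted(hits)


targets = ["ACGTACGTTT", "GGGACGTAAA"]
index = build_seed_index(targets, k=4)
for tid, q_start, t_start, length in align("ACGTA", targets, index, k=4):
    print(f"target {tid}: query[{q_start}:{q_start + length}] matches at target offset {t_start}")
```

In a distributed setting, each index lookup in align could become a remote access, which is the cost the caching and communication optimizations mentioned in the abstract are aimed at reducing.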
Keywords :
C language; bioinformatics; cache storage; optimisation; parallel processing; resource allocation; Cray XC30 supercomputer; I/O intensive phases; aligning phase; bioinformatics; communication optimizations; communication reduction; fine-grained parallelism; high performance distributed hash table; load balancing scheme; merAligner; one-sided communication capabilities; parallel sequence aligner; query sequences; seed index; seed-and-extend algorithm; sequence matching; software caching schemes; unified parallel C; wheat genome data; Bioinformatics; Data structures; Genomics; Indexes; Load management; Optimization; Software;
Conference_Title :
Parallel and Distributed Processing Symposium (IPDPS), 2015 IEEE International
Conference_Location :
Hyderabad
DOI :
10.1109/IPDPS.2015.96