Title :
Accelerating Bowtie2 with a lock-less concurrency approach and memory affinity
Author_Institution :
Comput. Sci. Dept., Univ. of Torino, Turin, Italy
Abstract :
Implementing DNA alignment tools for bioinformatics raises several problems that affect performance. The time taken by a single alignment is not predictable, and several factors influence it: for example, sequence length determines the computational grain of the task, while mismatches and insertions/deletions (indels) increase the time needed to complete an alignment. Moreover, alignment is a strongly memory-bound problem because of its irregular memory access patterns and limited memory bandwidth. Many alignment tools have been implemented over the years. A concrete example is Bowtie2, a concurrent, Pthread-based aligner that is one of the fastest state-of-the-art alignment tools not based on GPUs. Bowtie2 exploits concurrency by instantiating a pool of threads that access a global input dataset, share the reference genome, and access different objects for collecting alignment results. This paper presents a modified implementation of Bowtie2 in which the concurrency structure has been changed. The proposed implementation exploits the task-farm skeleton pattern, implemented as a Master-Worker. The Master-Worker pattern makes it possible to delegate dataset reading to the Master thread alone and to make private to each Worker the data structures that are shared in the original version. Only the reference genome remains shared. As a further optimisation, the Master and each Worker were pinned to cores and the reference genome was allocated interleaved across memory nodes. The proposed implementation gains up to 10 speedup points over the original implementation.
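The ownership structure described in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the names `run_farm`, `worker` and `count_matches` are hypothetical, the "alignment" is a toy exact-match count, and reads are dispatched round-robin as an assumed policy. Python's `queue.SimpleQueue` is internally synchronised, so the sketch shows only who owns which data (the Master alone reads the input, each Worker has a private inbox and result list, the reference is shared read-only). It does not reproduce lock-less channels, core pinning or interleaved memory allocation.

```python
import queue
import threading


def count_matches(reference, read):
    """Toy stand-in for an aligner: count exact occurrences of read in reference."""
    count = 0
    start = reference.find(read)
    while start != -1:
        count += 1
        start = reference.find(read, start + 1)
    return count


def worker(inbox, reference, results):
    """Consume tasks from a private inbox and append to a private result list."""
    while True:
        task = inbox.get()
        if task is None:  # end-of-stream marker sent by the Master
            break
        read_id, read = task
        results.append((read_id, count_matches(reference, read)))


def run_farm(reads, reference, n_workers=2):
    """Master: the only thread that touches the input dataset."""
    inboxes = [queue.SimpleQueue() for _ in range(n_workers)]
    results = [[] for _ in range(n_workers)]  # one private list per Worker
    threads = [
        threading.Thread(target=worker, args=(inboxes[i], reference, results[i]))
        for i in range(n_workers)
    ]
    for t in threads:
        t.start()
    # Round-robin dispatch (an assumption of this sketch).
    for read_id, read in enumerate(reads):
        inboxes[read_id % n_workers].put((read_id, read))
    for inbox in inboxes:
        inbox.put(None)
    for t in threads:
        t.join()
    # Merge the private result lists only after all Workers have finished.
    return sorted(item for part in results for item in part)
```

Because no Worker writes to another Worker's list and the reference is only read, the merge at the end is the single point where results are combined.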
Keywords :
DNA; bioinformatics; concurrency control; data structures; genomics; multi-threading; DNA alignment tools; computational grain; concurrent Pthread-based Bowtie2; global input dataset; irregular memory access patterns; lock-less concurrency approach; master thread dataset reading; master-worker pattern; memory affinity; memory nodes; memory-bandwidth; memory-bound problem; reference genome; sequence length; task-farm skeleton pattern; worker data structures; Instruction sets; Programming; Resource management; Skeleton; Multithreading;
Conference_Title :
Parallel, Distributed and Network-Based Processing (PDP), 2014 22nd Euromicro International Conference on
Conference_Location :
Torino
DOI :
10.1109/PDP.2014.50