Title :
Fast Burrows Wheeler Compression Using All-Cores
Author :
Aditya Deshpande;P.J. Narayanan
Author_Institution :
Univ. of Illinois at Urbana Champaign, Champaign, IL, USA
fDate :
5/1/2015 12:00:00 AM
Abstract :
In this paper, we present an all-core implementation of the Burrows-Wheeler Compression (BWC) algorithm that exploits all computing resources on a system. Our focus is to provide significant benefit to everyday users on common end-to-end applications by exploiting the parallelism of multiple CPU cores and additional accelerators, viz. many-core GPUs, on their machines. The all-core framework is suitable for problems that process large files or buffers in blocks. We consider a system to be made up of compute stations and use a work-queue to dynamically divide the tasks among them. Each compute station uses an implementation tuned to exploit its own architecture. We develop a fast GPU BWC algorithm by extending the state-of-the-art GPU string sort to efficiently perform the Burrows-Wheeler Transform (BWT) step of BWC. Our hybrid BWC with GPU acceleration achieves a 2.9× speedup over the best CPU implementation. Our all-core framework allows concurrent processing of blocks by both the GPU and all available CPU cores. We achieve a 3.06× speedup by using all CPU cores and a 4.87× speedup when we additionally use an accelerator, i.e., a GPU. Our approach scales with the number and the different types of computing resources or accelerators found on a system.
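The block-wise work-queue idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the names `bwt`, `process_blocks`, and `station` are hypothetical, plain Python threads stand in for the CPU and GPU compute stations, and the transform uses a naive sort of rotations rather than the GPU string sort the paper extends.

```python
import queue
import threading


def bwt(block):
    """Naive Burrows-Wheeler transform of one block (illustration only).

    Sorts all rotations of the block and returns the last column together
    with the row index of the original block in the sorted order.
    """
    n = len(block)
    if n == 0:
        return b"", 0
    doubled = block + block  # rotation i is doubled[i:i + n]
    order = sorted(range(n), key=lambda i: doubled[i:i + n])
    last_column = bytes(doubled[i + n - 1] for i in order)
    return last_column, order.index(0)


def process_blocks(data, block_size, num_workers):
    """Split data into blocks and let workers pull them from a shared queue."""
    blocks = [data[i:i + block_size] for i in range(0, len(data), block_size)]
    work = queue.Queue()
    for index, block in enumerate(blocks):
        work.put((index, block))
    results = [None] * len(blocks)

    def station():
        # Each hypothetical compute station takes the next free block, so
        # faster stations naturally end up processing more blocks.
        while True:
            try:
                index, block = work.get_nowait()
            except queue.Empty:
                return
            results[index] = bwt(block)

    threads = [threading.Thread(target=station) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Because every station pulls from the same queue, no static partitioning of blocks is needed; under this assumption a faster station (for example a GPU-backed one) would simply dequeue more often than a slower one.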
Keywords :
"Graphics processing units","Sorting","Multicore processing","Transforms","Optimization","Runtime","Instruction sets"
Conference_Title :
Parallel and Distributed Processing Symposium Workshop (IPDPSW), 2015 IEEE International
DOI :
10.1109/IPDPSW.2015.53