Title :
FPGA acceleration of reference-based compression for genomic data
Author :
James Arram;Moritz Pflanzer;Thomas Kaplan;Wayne Luk
Author_Institution :
Department of Computing, Imperial College London, United Kingdom
Abstract :
One of the key challenges facing genomics today is efficiently storing the massive amounts of data generated by next-generation sequencing platforms. Reference-based compression is a popular strategy for reducing the size of genomic data, whereby sequence information is encoded as a mapping to a known reference sequence. Determining the mapping is a computationally intensive problem, and is the bottleneck of most reference-based compression tools currently available. This paper presents the first FPGA acceleration of reference-based compression for genomic data. We develop a new mapping algorithm based on the FM-index search operation which includes optimisations targeting the compression ratio and speed. Our hardware design is implemented on a Maxeler MPC-X2000 node comprising 8 Altera Stratix V FPGAs. When evaluated against compression tools currently available, our tool achieves a higher compression ratio, shorter compression time, and lower energy consumption for both FASTA and FASTQ formats. For example, our tool achieves a 30% higher compression ratio and is 71.9 times faster than the fastqz tool.
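As an illustration of the FM-index search operation mentioned in the abstract, the following is a minimal software sketch of exact-match backward search in Python. It is not the paper's mapping algorithm or hardware design: the function names (`build_fm_index`, `backward_search`), the naive suffix-array construction, and the toy reference string are all assumptions chosen for readability, and the paper's optimisations for compression ratio and speed are not reproduced here.

```python
def build_fm_index(reference):
    """Build a toy FM-index (suffix array, C table, Occ table) for a reference.

    Hypothetical helper for illustration; uses a naive O(n^2 log n) suffix sort.
    """
    text = reference + "$"  # "$" is a sentinel that sorts before A, C, G, T
    sa = sorted(range(len(text)), key=lambda i: text[i:])
    # Burrows-Wheeler transform: the character preceding each sorted suffix
    bwt = "".join(text[i - 1] for i in sa)
    alphabet = sorted(set(text))

    # c_table[c]: number of characters in text strictly smaller than c
    c_table, total = {}, 0
    for c in alphabet:
        c_table[c] = total
        total += text.count(c)

    # occ[c][i]: number of occurrences of c in bwt[:i]
    occ = {c: [0] * (len(bwt) + 1) for c in alphabet}
    for i, ch in enumerate(bwt):
        for c in alphabet:
            occ[c][i + 1] = occ[c][i] + (1 if ch == c else 0)
    return sa, c_table, occ


def backward_search(pattern, sa, c_table, occ):
    """Return sorted start positions of exact matches of pattern in the reference."""
    low, high = 0, len(sa)  # half-open suffix-array interval [low, high)
    for ch in reversed(pattern):  # extend the match one character to the left
        if ch not in c_table:
            return []
        low = c_table[ch] + occ[ch][low]
        high = c_table[ch] + occ[ch][high]
        if low >= high:  # interval is empty: no occurrence
            return []
    return sorted(sa[low:high])


if __name__ == "__main__":
    sa, c_table, occ = build_fm_index("ACGTACGTTA")
    print(backward_search("ACG", sa, c_table, occ))
```

In a reference-based compressor, a read that matches exactly could then be stored as a (position, length) pair rather than as its bases; handling mismatches and choosing among multiple match positions are left out of this sketch.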
Keywords :
"Genomics","Acceleration","Bioinformatics","Field programmable gate arrays","Indexes","Algorithm design and analysis","Optimization"
Conference_Title :
Field Programmable Technology (FPT), 2015 International Conference on
DOI :
10.1109/FPT.2015.7393126