Title :
Compression of Quality Factors in Next Generation Sequencing
Author :
Nalbantoglu, O.U. ; Sayood, K.
Author_Institution :
Dept. of Electr. Eng., Univ. of Nebraska, Lincoln, NE, USA
Abstract :
We propose a compression algorithm for the quality scores contained in FASTQ files, which are generated in large volumes during high-throughput sequencing. The proposed algorithm is a context-dependent arithmetic coder based on observations of the structure of quality scores in FASTQ files. Simulation results indicate that the algorithm significantly outperforms the current state of the art.
Keywords :
Q-factor; arithmetic codes; data compression; FASTQ files; compression algorithm; context dependent arithmetic coder; high throughput sequencing; next generation sequencing; quality factors compression; quality scores; Context; Educational institutions; Electrical engineering; Next generation networking; Sequential analysis; Biological sequence compression; DNA; Quality factor
Conference_Title :
Data Compression Conference (DCC), 2014
Conference_Location :
Snowbird, UT
DOI :
10.1109/DCC.2014.46