Title :
An efficient compressor for biological sequences
Author :
Gupta, Arpan ; Dubey, K.K.
Author_Institution :
Dept. of CS & IT, MJP Rohilkhand Univ., Bareilly, India
Abstract :
This paper introduces a state-of-the-art compressor for DNA sequences that makes use of a replacement method. The replacement method introduces words, and a word-based compression scheme is then used for encoding. The encoder uses the frequency distribution to assign codes to words. The designed statistical compression algorithm is efficient and effective for DNA sequence compression. Experiments show that our algorithm outperforms existing compressors on typical DNA sequence datasets.
Keywords :
DNA; bioinformatics; data compression; encoding; statistical analysis; DNA sequence compression; DNA sequence datasets; biological sequence compressor; encoding; frequency distribution; replacement method; state of art compressor; statistical compression algorithm; word based compression scheme; Biological information theory; Compression algorithms; Context; DNA; Dictionaries; Encoding; Vocabulary; DNA compression; DNA sequences; Word based tagged code;
Conference_Title :
Advance Computing Conference (IACC), 2013 IEEE 3rd International
Conference_Location :
Ghaziabad
Print_ISBN :
978-1-4673-4527-9
DOI :
10.1109/IAdCC.2013.6514310