DocumentCode :
3576227
Title :
A bio-sequence k-mer frequency counter (kFC)
Author :
Biji, C.L. ; Nair, Achuthsankar S. ; Madhu, Manu K. ; Vijayakumar, R.
Author_Institution :
Dept. of Comput. Biol. & Bioinf., Univ. of Kerala, Thiruvananthapuram, India
fYear :
2014
Firstpage :
353
Lastpage :
356
Abstract :
High-throughput data from next-generation sequencing technologies create a need to identify over-represented k-mers for de novo sequencing and assembly. Even though generating the k-mer frequency distribution of a sequence seems to be a simple task, memory usage and running time are two important concerns, especially for higher-order k-mers. This paper proposes a method to count over-represented k-mers in bio-sequences. The approach uses a hash table with an open addressing scheme to estimate the frequency count of k-mers. The algorithm supports both overlapping and non-overlapping patterns for nucleotide sequences, amino acid sequences, and next-generation sequencing read sequences. Moreover, it accepts nucleotide sequences from the extended alphabet set, in contrast to traditional k-mer tools, which accept only the standard alphabet.
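The counting scheme outlined in the abstract can be illustrated with a minimal Python sketch. It is not the authors' kFC implementation. The names `KmerTable` and `count_kmers`, the linear probing strategy, the fixed capacity, and the use of Python's built-in `hash` are all assumptions made for illustration; the abstract states only that a hash table with open addressing is used.

```python
class KmerTable:
    """Fixed-capacity hash table with open addressing (linear probing).

    Hypothetical illustration only; the probing strategy and capacity
    handling are assumptions, not details taken from the paper.
    """

    def __init__(self, capacity=1024):
        self.capacity = capacity
        self.keys = [None] * capacity
        self.counts = [0] * capacity
        self.size = 0

    def _slot(self, kmer):
        # Probe linearly until the k-mer or an empty slot is found.
        # One slot is always kept empty, so this loop terminates.
        i = hash(kmer) % self.capacity
        while self.keys[i] is not None and self.keys[i] != kmer:
            i = (i + 1) % self.capacity
        return i

    def add(self, kmer):
        i = self._slot(kmer)
        if self.keys[i] is None:
            if self.size + 1 >= self.capacity:
                raise MemoryError("hash table is full; use a larger capacity")
            self.keys[i] = kmer
            self.size += 1
        self.counts[i] += 1

    def get(self, kmer):
        i = self._slot(kmer)
        return self.counts[i] if self.keys[i] == kmer else 0

    def items(self):
        return [(k, c) for k, c in zip(self.keys, self.counts) if k is not None]


def count_kmers(seq, k, overlapping=True, capacity=1024):
    """Count k-mers in seq; the window advances by 1 (overlapping) or by k."""
    if k < 1:
        raise ValueError("k must be at least 1")
    step = 1 if overlapping else k
    table = KmerTable(capacity)
    for start in range(0, len(seq) - k + 1, step):
        table.add(seq[start:start + k])
    return table


overlap = count_kmers("ATATAT", 2, overlapping=True)
non_overlap = count_kmers("ATATAT", 2, overlapping=False)
# Extended-alphabet symbols such as N and R are treated like any other character.
extended = count_kmers("ANRN", 2)
```

Because the keys are plain substrings, the same sketch works for amino acid sequences and for extended-alphabet nucleotide symbols without any change. A memory-conscious implementation would more likely encode each k-mer as a packed integer, which this sketch does not attempt.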
Keywords :
biology computing; data analysis; organic compounds; sequences; alphabet set; amino acid sequences; bio-sequence k-mer frequency counter; de novo assembled; de novo sequenced; hash table; high-throughput sequencing data; k-mer frequency distribution; k-mer tool; kFC; memory time; memory usage; next generation sequencing technologies; nonoverlapping pattern; nucleotide sequences; open address scheme; read data analysis; read sequences; Algorithm design and analysis; Animals; Arrays; Bioinformatics; Genomics; Random access memory; Sequential analysis; Biosequence analysis; Read data analysis; k-mer counter;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Circuits, Communication, Control and Computing (I4C), 2014 International Conference on
Print_ISBN :
978-1-4799-6545-8
Type :
conf
DOI :
10.1109/CIMCA.2014.7057822
Filename :
7057822