DocumentCode
3341927
Title
Fast Dynamic Programming Based Sequence Alignment Algorithm
Author
Rashid, N.A.A. ; Abdullah, Rosni ; Talib, Abdullah Zawawi Haji ; Ali, Zalila
Author_Institution
Sch. of Comput. Sci., Sci. Univ. of Malaysia, Penang
fYear
2006
fDate
38838
Firstpage
1
Lastpage
7
Abstract
Protein sequence alignment is a basic operation in protein sequence analysis. Optimal sequence alignment algorithms are based on dynamic programming, and the Smith-Waterman algorithm is the most commonly used of them. However, it requires quadratic time and space. Heuristic algorithms such as FASTA and BLAST were introduced to speed up sequence alignment. FASTA is based on word search, whereas BLAST is based on maximal segment pairs. In a word search algorithm, lists of words from the query and database sequences are compared to determine whether two sequences share a region of sufficient similarity to merit further alignment with the Smith-Waterman algorithm. All of these algorithms use a substitution matrix based on the twenty-letter amino acid alphabet. However, research shows that reducing the alphabet to 10 amino acid groups does not affect the similarity measure. Our proposed algorithm uses a reduced amino acid alphabet to transform each protein sequence into a sequence of integers and uses n-grams to reduce the length of the sequence. The Smith-Waterman algorithm is then used to compute the similarity measure between two sequences. Results show that the proposed algorithm is as sensitive as the Smith-Waterman algorithm yet uses less space and performs better.
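The pipeline described in the abstract (reduced alphabet, n-gram compression, then Smith-Waterman) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the 10-group partition in REDUCED_GROUPS, the use of non-overlapping n-grams, the scoring values (match=2, mismatch=-1, gap=-1), and all function names are hypothetical choices made here, since the abstract does not specify them.

```python
# Illustrative sketch only: the grouping, n-gram scheme, and scores are assumptions.

# Hypothetical partition of the 20 amino acids into 10 groups.
REDUCED_GROUPS = ["LVIM", "C", "A", "G", "ST", "P", "FYW", "EDNQ", "KR", "H"]
RESIDUE_TO_CODE = {
    aa: code for code, group in enumerate(REDUCED_GROUPS) for aa in group
}


def encode(seq):
    """Map each residue to the integer code of its reduced-alphabet group."""
    return [RESIDUE_TO_CODE[aa] for aa in seq.upper()]


def to_ngrams(codes, n=2):
    """Split the code list into non-overlapping n-grams (assumed scheme).

    The sequence shrinks by roughly a factor of n; a shorter trailing
    chunk is kept as its own token.
    """
    return [tuple(codes[i:i + n]) for i in range(0, len(codes), n)]


def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-1):
    """Best local alignment score with a linear gap penalty.

    Only two rows of the DP matrix are kept, so memory is O(len(b)).
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            diag = prev[j - 1] + (match if x == y else mismatch)
            score = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best


def reduced_similarity(p, q, n=2):
    """Encode both proteins, compress them to n-grams, then score them."""
    return smith_waterman_score(to_ngrams(encode(p), n), to_ngrams(encode(q), n))
```

With n = 2, each sequence is about half its original length, so the dynamic programming matrix has about a quarter as many cells. One caveat of the non-overlapping scheme assumed here is that it depends on the reading frame: two similar regions offset by one residue would produce different tokens.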
Keywords
biology computing; dynamic programming; heuristic programming; molecular biophysics; proteins; query processing; BLAST; FASTA; Smith-Waterman algorithm; amino acids; protein sequence alignment; Algorithm design and analysis; Amino acids; Biomedical signal processing; Databases; Dynamic programming; Heuristic algorithms; Natural language processing; Protein sequence; Signal processing algorithms; Space technology; N-Gram methods; Reduced Amino Acid alphabet; Sequence Alignment
fLanguage
English
Publisher
IEEE
Conference_Titel
Distributed Frameworks for Multimedia Applications, 2006. The 2nd International Conference on
Conference_Location
Pulau Pinang
Print_ISBN
1-4244-0409-6
Type
conf
DOI
10.1109/DFMA.2006.296909
Filename
4077734