DocumentCode
3151022
Title
Iterative progressive alignment method (IPAM) for multiple sequence alignment
Author
Naznin, Farhana ; Sarker, Ruhul ; Essam, Daryl
Author_Institution
Australian Defence Force Acad., Univ. of New South Wales, Sydney, NSW, Australia
fYear
2009
fDate
6-9 July 2009
Firstpage
536
Lastpage
541
Abstract
Designing life-saving drugs, such as cancer drugs, requires accurate protein or DNA structures. These structures depend on multiple sequence alignment (MSA). MSA is a combinatorial optimization problem used to infer the structure of protein and DNA sequences from existing sequences. In this paper, we propose a new iterative progressive alignment method for multiple sequence alignment, which is a close variant of the MUSCLE algorithm. MUSCLE starts with the "kmer" distance table. Our algorithm, however, starts with either the "kmer" distance table or the "dynamic programming (DP)" distance table, depending on the length of the gene sequences. The other steps of the algorithm are: generating a guide tree using UPGMA, performing multiple sequence alignment, calculating "Kimura" distances from the aligned sequences, and applying new techniques to improve the multiple sequence alignment. We introduce two new techniques: the first generates guide trees from randomly selected sequences, and the second shuffles the sequences inside that tree. The output of the tree is a multiple sequence alignment, which is evaluated by the sum of pairs method (SPM) using real-valued data from PAM250. To test the performance of our algorithm, we compare it with well-known existing methods (T-Coffee, MUSCLE, MAFFT and ProbCons) on BAliBASE benchmarks and on our own NCBI-based datasets. The experimental results show that the proposed method works well in some situations where other methods have difficulty obtaining better solutions.
Keywords
biocomputing; dynamic programming; iterative methods; optimisation; sequential estimation; trees (mathematics); DNA structure design; MUSCLE algorithm; UPGMA algorithm; combinatorial optimization; dynamic programming distance table; gene sequences length; guide trees; iterative progressive alignment method; kmer distance table; multiple sequence alignment; protein design; sum of pairs method; Amino acids; Australia; DNA; Drugs; Dynamic programming; Iterative algorithms; Iterative methods; Polymers; Proteins; Sequences; Dynamic Programming (DP); Guide-tree; Multiple Sequence Alignment (MSA); Progressive Alignment;
fLanguage
English
Publisher
IEEE
Conference_Titel
Computers & Industrial Engineering, 2009. CIE 2009. International Conference on
Conference_Location
Troyes
Print_ISBN
978-1-4244-4135-8
Electronic_ISBN
978-1-4244-4136-5
Type
conf
DOI
10.1109/ICCIE.2009.5223562
Filename
5223562