DocumentCode :
2338312
Title :
RBT-Km: K-Means clustering for Multiple Sequence Alignment
Author :
Taheri, Javid ; Zomaya, Albert Y.
Author_Institution :
Sch. of Inf. Technol., Univ. of Sydney, Sydney, NSW, Australia
fYear :
2010
fDate :
16-19 May 2010
Firstpage :
1
Lastpage :
8
Abstract :
This paper presents a novel approach for solving the Multiple Sequence Alignment (MSA) problem. K-Means clustering is combined with the Rubber Band Technique (RBT) to introduce an iterative optimization algorithm, namely RBT-Km, to find the optimal alignment for a set of input protein sequences. In this technique, the MSA problem is modeled as a rubber band, while the solution space is modeled as a plate with several poles corresponding to locations in the input sequences that are most likely to be correlated and/or biologically related. K-Means clustering is then used to discriminate biologically related locations from those that may appear by chance. RBT-Km is tested with one of the well-known benchmarks in this field (BALiBASE 2.0). The results demonstrate the superiority of the proposed technique even in the case of formidable sequences.
Keywords :
benchmark testing; bioinformatics; iterative methods; optimisation; pattern clustering; proteins; sequences; MSA problem; RBT-Km; iterative optimization algorithm; k-means clustering; multiple sequence alignment; protein sequence; rubber band technique; Algorithm design and analysis; Benchmark testing; Biology; Clustering algorithms; Hidden Markov models; Optimization; Rubber; K-Means Clustering; Multiple Sequence Alignment
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Computer Systems and Applications (AICCSA), 2010 IEEE/ACS International Conference on
Conference_Location :
Hammamet
Print_ISBN :
978-1-4244-7716-6
Type :
conf
DOI :
10.1109/AICCSA.2010.5586934
Filename :
5586934