DocumentCode
3519216
Title
Using Global Sequence Similarity to Enhance Biological Sequence Labeling
Author
Caragea, Cornelia ; Sinapov, Jivko ; Dobbs, Drena ; Honavar, Vasant
Author_Institution
Comput. Sci. Dept., Iowa State Univ., Ames, IA
fYear
2008
fDate
3-5 Nov. 2008
Firstpage
104
Lastpage
111
Abstract
Identifying functionally important sites from biological sequences, formulated as a biological sequence labeling problem, has broad applications ranging from rational drug design to the analysis of metabolic and signal transduction networks. In this paper, we present an approach to biological sequence labeling that takes into account the global similarity between biological sequences. Our approach combines unsupervised and supervised learning techniques. Given a set of sequences and a similarity measure defined on pairs of sequences, we learn a mixture of experts model by using spectral clustering to learn the hierarchical structure of the model and by using Bayesian approaches to combine the predictions of the experts. We evaluate our approach on two important biological sequence labeling problems: RNA-protein and DNA-protein interface prediction. The results of our experiments show that global sequence similarity can be exploited to improve the performance of classifiers trained to label biological sequence data.
Keywords
DNA; biology computing; learning (artificial intelligence); DNA-protein interface; RNA-protein interface; Bayesian approaches; biological sequence labeling; global sequence similarity; hierarchical structure; Bioinformatics; Biological information theory; Biological system modeling; Biology; Computer science; Labeling; Predictive models; Proteins; Sequences; Supervised learning; global similarity; mixture of experts
fLanguage
English
Publisher
IEEE
Conference_Titel
Bioinformatics and Biomedicine, 2008. BIBM '08. IEEE International Conference on
Conference_Location
Philadelphia, PA
Print_ISBN
978-0-7695-3452-7
Type
conf
DOI
10.1109/BIBM.2008.54
Filename
4684880