DocumentCode :
3265268
Title :
Hypermotifs: Novel Discriminatory Patterns for Nucleotide Sequences and their Application to Core Promoter Prediction in Eukaryotes
Author :
Pridgeon, Carey ; Corne, David
Author_Institution :
SECAM, Harrison Building, University of Exeter, Exeter EX4 4QF, UK, c.pridgeon@exeter.ac.uk
fYear :
2005
fDate :
14-15 Nov. 2005
Firstpage :
1
Lastpage :
7
Abstract :
We approach the general problem of finding a model that discriminates between classes of nucleotide sequences. In this area, a common approach is to train a model, such as a neural network or a hidden Markov model, to perform the discrimination, using as inputs either the raw sequences encoded in a standard form or features derived from the raw data in a pre-processing stage. In this paper, a novel discriminatory pattern structure for nucleotide sequences, called a hypermotif, is introduced, and evolutionary computation is used to evolve a collection of specific hypermotifs that discriminate between classes in the data. The raw nucleotide data are then transformed into feature vectors, where the features are the individual scores on the evolved hypermotifs. Using this transformation, any classification method may then be used to build an accurate predictive model. We test the approach on a database of eukaryotic promoters and find that it outperforms a standard multilayer perceptron (despite using a linear discriminant as the final classifier) and provides similar performance to the best approach so far for these data (which uses a time delay neural network).
Keywords :
Bioinformatics; Computational intelligence; Delay effects; Evolutionary computation; Hidden Markov models; Multilayer perceptrons; Neural networks; Predictive models; Spatial databases; Testing;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Computational Intelligence in Bioinformatics and Computational Biology, 2005. CIBCB '05. Proceedings of the 2005 IEEE Symposium on
Print_ISBN :
0-7803-9387-2
Type :
conf
DOI :
10.1109/CIBCB.2005.1594949
Filename :
1594949