Title :
Feature recognition on expressed sequence tags of human DNA
Author :
Hatzigeorgiou, Artemis G. ; Reczko, Martin
Author_Institution :
Synaptic Ltd., Heraklion, Greece
Abstract :
Expressed sequence tags (ESTs) are small parts of DNA that are used to clone new genes. One main characteristic of ESTs is that they contain more than 1% sequencing errors. What we need to know is which parts of an EST contain information about proteins, the so-called coding regions. In this paper we describe an error-tolerant program for the prediction of such coding regions in ESTs. The program is based on a combination of statistical methods and several artificial neural networks (ANN). On an independent test set of 127 ESTs, 89.7% of the nucleotides are predicted correctly as coding or noncoding. These results are independent of the existence of homologous gene or protein sequences and are representative of the application to the largest part of all ESTs.
Keywords :
DNA; biology computing; feature extraction; neural nets; pattern recognition; statistical analysis; ANN; EST; artificial neural networks; coding regions; error-tolerant program; expressed sequence tags; feature recognition; homologous gene sequences; human DNA; independent test set; nucleotides; protein sequences; proteins; sequencing errors; statistical methods; Bioinformatics; Cloning; Error correction; Genomics; Humans; Polymers; Sequences
Conference_Title :
Neural Networks, 1999. IJCNN '99. International Joint Conference on
Conference_Location :
Washington, DC
Print_ISBN :
0-7803-5529-6
DOI :
10.1109/IJCNN.1999.836228