DocumentCode :
2325540
Title :
Recognizing patterns in protein sequences using iteration-performing calculations in genetic programming
Author :
Koza, John R.
Author_Institution :
Dept. of Comput. Sci., Stanford Univ., CA, USA
fYear :
1994
fDate :
27-29 Jun 1994
Firstpage :
244
Abstract :
Uses genetic programming with automatically defined functions (ADFs) for the dynamic creation of a pattern-recognizing computer program consisting of initially-unknown detectors, an initially-unknown iterative calculation incorporating the as-yet-undiscovered detectors, and an initially-unspecified final calculation incorporating the results of the as-yet-unspecified iteration. The program's goal is to recognize a given protein segment as being a transmembrane domain or non-transmembrane area of the protein. Genetic programming with automatic function definition is given a training set of differently-sized mouse protein segments and their correct classification. Correlation is used as the fitness measure. Automatic function definition enables genetic programming to dynamically create subroutines (detectors). A restricted form of iteration is introduced to enable genetic programming to perform calculations on the values returned by the detectors. When cross-validated, the best genetically-evolved recognizer for transmembrane domains achieves an out-of-sample correlation of 0.968 and an out-of-sample error rate of 1.6%. This error rate is better than that recently reported for five other methods.
Keywords :
biology computing; biomembranes; functions; genetic algorithms; iterative methods; pattern recognition; proteins; subroutines; automatic function definition; automatically defined functions; classification; correlation; detectors; dynamic creation; dynamic subroutine creation; fitness measure; genetic programming; initially-unknown detectors; initially-unknown iterative calculation; initially-unspecified final calculation; iteration-performing calculations; mouse; nontransmembrane area; pattern-recognizing computer program; protein segment recognition; protein sequences; training set; transmembrane domain; undiscovered detectors; unspecified iteration; Amino acids; Biomembranes; Chemicals; Computer science; Detectors; Error analysis; Genetic programming; Mice; Pattern recognition; Proteins;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Evolutionary Computation, 1994. IEEE World Congress on Computational Intelligence., Proceedings of the First IEEE Conference on
Conference_Location :
Orlando, FL
Print_ISBN :
0-7803-1899-4
Type :
conf
DOI :
10.1109/ICEC.1994.350008
Filename :
350008