Title :
MAHATMA: A Genetic Programming-Based Tool for Protein Classification
Author :
Tsunoda, Denise F. ; Freitas, Alex A. ; Lopes, Heitor S.
Author_Institution :
Fed. Univ. of Parana, Curitiba, Brazil
Date :
Nov. 30 2009-Dec. 2 2009
Abstract :
Proteins can be grouped into families according to features such as hydrophobicity, composition, or structure, with the aim of establishing common biological functions. This paper presents a system designed to discover features (particular sequences of amino acids, or motifs) that occur very often in proteins of a given family but rarely in proteins of other families. These features can be used to classify unknown proteins, that is, to predict their function by analyzing their primary structure. Experiments were carried out on a set of enzymes extracted from the Protein Data Bank. The heuristic method is based on genetic programming, with operators specially tailored to the target problem. Final performance was measured using sensitivity (Se) and specificity (Sp). The best results obtained for the enzyme dataset suggest that the proposed evolutionary computation method is very effective at finding predictive features (motifs) for protein classification.
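As an illustration of the two performance measures named in the abstract, the sketch below scores a single candidate motif against a labelled set of sequences. It is not the MAHATMA implementation: it assumes the standard definitions Se = TP / (TP + FN) and Sp = TN / (TN + FP), treats a motif as a plain substring of the primary structure, and uses invented toy sequences and family labels. The function names (motif_confusion, sensitivity, specificity) are hypothetical.

```python
def motif_confusion(motif, sequences, target_family):
    """Count TP, FN, TN, FP for one motif used as a family predictor.

    Assumption: a sequence is predicted to belong to target_family
    exactly when it contains the motif as a substring.
    """
    tp = fn = tn = fp = 0
    for seq, family in sequences:
        predicted = motif in seq
        actual = family == target_family
        if predicted and actual:
            tp += 1
        elif actual:
            fn += 1
        elif predicted:
            fp += 1
        else:
            tn += 1
    return tp, fn, tn, fp


def sensitivity(tp, fn):
    """Fraction of target-family proteins that contain the motif."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def specificity(tn, fp):
    """Fraction of other-family proteins that do not contain the motif."""
    return tn / (tn + fp) if (tn + fp) else 0.0


# Toy data, invented for illustration only (not taken from the paper).
toy = [
    ("MKVLHGSAA", "A"),
    ("TTHGSLLK", "A"),
    ("MKPLLQ", "A"),
    ("GGHGSPP", "B"),
    ("AAALLKK", "B"),
    ("QQWERTY", "B"),
    ("PPLMNV", "B"),
]

tp, fn, tn, fp = motif_confusion("HGS", toy, "A")
se = sensitivity(tp, fn)
sp = specificity(tn, fp)
print(f"Se = {se:.2f}, Sp = {sp:.2f}")
```

A motif that is frequent in its own family and rare elsewhere scores high on both measures, which is the property the abstract describes as predictive.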
Keywords :
biology computing; genetic algorithms; pattern classification; proteins; MAHATMA; amino acids; biological functions; enzymes; evolutionary computation method; genetic programming-based tool; heuristic method; motifs; protein classification; protein data bank; Amino acids; Biochemistry; Data mining; Evolutionary computation; Genetic programming; Intelligent structures; Intelligent systems; Peptides; Proteins; Sequences
Conference_Title :
Intelligent Systems Design and Applications, 2009. ISDA '09. Ninth International Conference on
Conference_Location :
Pisa
Print_ISBN :
978-1-4244-4735-0
Electronic_ISBN :
978-0-7695-3872-3
DOI :
10.1109/ISDA.2009.14