DocumentCode
2470301
Title
The use of support vector machine and genetic algorithms to predict protein function
Author
Resende, Walkíria K. ; Nascimento, Renato A. ; Xavier, Carolina R. ; Lopes, Iara F. ; Nobre, Cristiane N.
Author_Institution
Dept. of Comput. Sci., Univ. Fed. de Sao Joao del Rei, Sao Joao del Rei, Brazil
fYear
2012
fDate
14-17 Oct. 2012
Firstpage
1773
Lastpage
1778
Abstract
In bioinformatics, predicting protein function is an important but difficult task. Using a set of enzymes from the Hydrolase, Isomerase, Ligase, Lyase, Transferase and Oxidoreductase classes, previously used by Dobson et al., this paper proposes a self-learning process that predicts an enzyme's class from its primary and secondary structures, using a Support Vector Machine (SVM) classifier and a genetic algorithm. An SVM is a supervised machine learning algorithm capable of solving linear and non-linear classification problems. During learning, both the training data and the corresponding outputs are presented to the SVM so that its parameters can be adjusted. This study used genetic algorithms, optimization heuristics often used to estimate parameters, to adjust the main parameters of the classifier, such as the kernel function type and the parameter C, which controls the trade-off between the training error and the margin of separation between classes. For this prediction problem, the results indicate that the best kernel function is an RBF with width 6.1 and C = 6.9. With these parameters, the classifier obtains an average accuracy of 79.74%.
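The parameter search described in the abstract can be illustrated with a minimal sketch of a genetic algorithm over the kernel type, C and the kernel width. This is not the authors' implementation: the candidate kernel list, the parameter bounds, the operators (tournament selection, uniform crossover, Gaussian mutation, elitism) and the function names are all assumptions made for illustration. In particular, `toy_fitness` is only a placeholder; in a real experiment it would be replaced by the cross-validated accuracy of an SVM trained with the candidate parameters.

```python
import random

# Hypothetical search space; the paper's actual candidate kernels and bounds are not given here.
KERNELS = ("linear", "poly", "rbf")
LOW, HIGH = 0.1, 10.0


def random_individual(rng):
    """One candidate SVM configuration: kernel type, C, and kernel width."""
    return {
        "kernel": rng.choice(KERNELS),
        "C": rng.uniform(LOW, HIGH),
        "width": rng.uniform(LOW, HIGH),
    }


def crossover(a, b, rng):
    """Uniform crossover: each gene is copied from one of the two parents."""
    return {key: (a[key] if rng.random() < 0.5 else b[key]) for key in a}


def mutate(ind, rng, rate=0.2):
    """Resample the kernel or perturb a numeric gene with probability `rate`."""
    child = dict(ind)
    if rng.random() < rate:
        child["kernel"] = rng.choice(KERNELS)
    for key in ("C", "width"):
        if rng.random() < rate:
            child[key] = min(HIGH, max(LOW, child[key] + rng.gauss(0.0, 1.0)))
    return child


def tournament(pop, scores, rng, k=3):
    """Return the fittest of k randomly drawn individuals."""
    contenders = rng.sample(range(len(pop)), k)
    return pop[max(contenders, key=lambda i: scores[i])]


def genetic_search(fitness, pop_size=20, generations=30, seed=0):
    """Maximise `fitness` over SVM configurations; the best individual always survives."""
    rng = random.Random(seed)
    pop = [random_individual(rng) for _ in range(pop_size)]
    for _ in range(generations):
        scores = [fitness(ind) for ind in pop]
        elite = pop[max(range(pop_size), key=lambda i: scores[i])]
        children = [elite]  # elitism: carry the current best over unchanged
        while len(children) < pop_size:
            p1 = tournament(pop, scores, rng)
            p2 = tournament(pop, scores, rng)
            children.append(mutate(crossover(p1, p2, rng), rng))
        pop = children
    return max(pop, key=fitness)


def toy_fitness(ind):
    # Placeholder surrogate, NOT real data: stands in for cross-validated SVM accuracy.
    bonus = 1.0 if ind["kernel"] == "rbf" else 0.0
    return bonus - (ind["C"] - 5.0) ** 2 - (ind["width"] - 5.0) ** 2


best = genetic_search(toy_fitness)
print(best)
```

Because the fittest individual is copied unchanged into each new generation, the best score never decreases from one generation to the next; the other operator choices are arbitrary and would normally be tuned.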
Keywords
bioinformatics; genetic algorithms; learning (artificial intelligence); parameter estimation; pattern classification; proteins; radial basis function networks; support vector machines; RBF; SVM classifier; hydrolase class; isomerase class; kernel function type; ligase class; lyase class; nonlinear classification problems; optimization heuristics; oxidoreductase class; parameter C; primary structures; protein function prediction; secondary structures; self-learning process; supervised machine learning algorithm; support vector machine; training data; transferase class; Accuracy; Amino acids; Kernel; Prediction
fLanguage
English
Publisher
IEEE
Conference_Titel
Systems, Man, and Cybernetics (SMC), 2012 IEEE International Conference on
Conference_Location
Seoul
Print_ISBN
978-1-4673-1713-9
Electronic_ISBN
978-1-4673-1712-2
Type
conf
DOI
10.1109/ICSMC.2012.6377994
Filename
6377994
Link To Document