DocumentCode :
2470301
Title :
The use of support vector machine and genetic algorithms to predict protein function
Author :
Resende, Walkíria K. ; Nascimento, Renato A. ; Xavier, Carolina R. ; Lopes, Iara F. ; Nobre, Cristiane N.
Author_Institution :
Dept. of Comput. Sci., Univ. Fed. de Sao Joao del Rei, Sao Joao del Rei, Brazil
fYear :
2012
fDate :
14-17 Oct. 2012
Firstpage :
1773
Lastpage :
1778
Abstract :
In bioinformatics, predicting protein function is considered an important but difficult task. Using a set of enzymes from the Hydrolase, Isomerase, Ligase, Lyase, Transferase and Oxidoreductase classes, previously used by Dobson et al., this paper proposes a self-learning process that predicts the class of each enzyme from its primary and secondary structures, using a Support Vector Machine (SVM) classifier and a genetic algorithm. An SVM is a supervised machine learning algorithm capable of solving linear and non-linear classification problems. During the learning process, both the training data and the corresponding outputs are presented to the SVM so that its parameters can be adjusted. This study used genetic algorithms, optimization heuristics often used to estimate parameters, to tune the main parameters of the classifier, such as the kernel function type and the parameter C, which controls the trade-off between the training error and the margin of separation between classes. For this specific prediction problem, the results indicate that the best kernel is an RBF with width 6.1 and C equal to 6.9. With these parameters, the classifier obtains an average accuracy of 79.74%.
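The parameter search described in the abstract can be illustrated with a minimal sketch of a genetic algorithm over two real-valued parameters (C and the RBF width). This is not the authors' implementation: the function names, population size, mutation scheme, and search bounds are all assumptions made for illustration, and `toy_fitness` is a stand-in for what would, in a real experiment, be the cross-validated accuracy of an SVM trained with the candidate parameters.

```python
import random


def genetic_search(fitness, bounds, pop_size=20, generations=30,
                   mutation_rate=0.2, seed=0):
    """Maximize fitness(params) over real-valued params within bounds."""
    rng = random.Random(seed)

    def random_individual():
        return [rng.uniform(lo, hi) for lo, hi in bounds]

    def mutate(ind):
        # Gaussian perturbation of some genes, clipped to the bounds
        out = []
        for value, (lo, hi) in zip(ind, bounds):
            if rng.random() < mutation_rate:
                value += rng.gauss(0.0, 0.1 * (hi - lo))
            out.append(min(hi, max(lo, value)))
        return out

    def crossover(a, b):
        # Uniform crossover: each gene is taken from one of the two parents
        return [x if rng.random() < 0.5 else y for x, y in zip(a, b)]

    def tournament(scored):
        # Binary tournament: the fitter of two randomly drawn individuals
        a, b = rng.sample(scored, 2)
        return a[1] if a[0] >= b[0] else b[1]

    population = [random_individual() for _ in range(pop_size)]
    best_score, best = float("-inf"), None
    for _ in range(generations):
        scored = [(fitness(ind), ind) for ind in population]
        for score, ind in scored:
            if score > best_score:
                best_score, best = score, ind
        # Elitism: the best individual found so far survives unchanged
        population = [best] + [
            mutate(crossover(tournament(scored), tournament(scored)))
            for _ in range(pop_size - 1)
        ]
    return best, best_score


def toy_fitness(params):
    # Hypothetical stand-in for cross-validated SVM accuracy;
    # its maximum is placed arbitrarily at C = 5, width = 5.
    c, width = params
    return -((c - 5.0) ** 2 + (width - 5.0) ** 2)


# Assumed search ranges for (C, width); the paper's actual ranges are not given here.
best_params, best_score = genetic_search(
    toy_fitness, bounds=[(0.1, 10.0), (0.1, 10.0)]
)
```

To apply the sketch to an actual classifier, `toy_fitness` would be replaced by a function that trains an SVM with the candidate parameters and returns its validation accuracy. How the paper's "width" maps onto a particular library's RBF parameter is not stated in the abstract, so that conversion is left open.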
Keywords :
bioinformatics; genetic algorithms; learning (artificial intelligence); parameter estimation; pattern classification; proteins; radial basis function networks; support vector machines; RBF; SVM classifier; hydrolase class; isomerase class; kernel function type; ligase class; lyase class; nonlinear classification problems; optimization heuristics; oxidoreductase class; parameter C; primary structures; protein function prediction; secondary structures; self-learning process; supervised machine learning algorithm; support vector machine; training data; transferase class; Accuracy; Amino acids; Kernel; Prediction
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Systems, Man, and Cybernetics (SMC), 2012 IEEE International Conference on
Conference_Location :
Seoul
Print_ISBN :
978-1-4673-1713-9
Electronic_ISBN :
978-1-4673-1712-2
Type :
conf
DOI :
10.1109/ICSMC.2012.6377994
Filename :
6377994