DocumentCode
2134447
Title
Prediction of protein kinase-specific phosphorylation sites using Random forest algorithm
Author
Wenwen Fan ; Liang Zou ; Ao Li ; Minghui Wang
Author_Institution
Sch. of Inf. Sci. & Technol., Univ. of Sci. & Technol. of China, Hefei, China
fYear
2012
fDate
16-18 Oct. 2012
Firstpage
986
Lastpage
989
Abstract
Reversible phosphorylation is an important mechanism for controlling protein activity in cellular regulatory processes. Experimental identification of kinase-specific phosphorylation sites in substrates is costly in both time and labor, so rapid and effective machine learning methods for prediction are desirable. In this paper, we adopt the random forest (RF) algorithm to predict phosphorylation sites. Comparison with Bayesian decision theory (BDT) and support vector machine (SVM) on four kinase/kinase-family datasets shows that RF performs consistently better. For example, on the MAPK data the RF algorithm achieved an AUC of 0.97, which was 0.04 and 0.03 higher than those of BDT and SVM, respectively. In addition, at a fixed high specificity of 99%, the sensitivity of the RF algorithm reached 66%, which was 25% and 23% higher than those of BDT and SVM, respectively. These results show that RF is a powerful machine learning algorithm for protein phosphorylation site prediction.
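The abstract reports two evaluation figures: AUC, and sensitivity at a fixed specificity of 99%. The following is a minimal sketch of how such figures are commonly computed from classifier scores. It is not the authors' code; the function names `auc` and `sensitivity_at_specificity`, the "score > threshold means predicted site" convention, and the toy scores are all assumptions made for illustration.

```python
def auc(scores_pos, scores_neg):
    """AUC as the probability that a random positive (true site) scores
    higher than a random negative (non-site); ties count as 0.5."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


def sensitivity_at_specificity(scores_pos, scores_neg, min_specificity=0.99):
    """Best sensitivity over all thresholds whose specificity is at least
    min_specificity. Assumed convention: score > threshold => predicted site."""
    best = 0.0
    for t in sorted(set(scores_pos) | set(scores_neg)):
        # Specificity: fraction of negatives correctly rejected at threshold t.
        specificity = sum(n <= t for n in scores_neg) / len(scores_neg)
        if specificity >= min_specificity:
            # Sensitivity: fraction of positives correctly accepted at threshold t.
            sensitivity = sum(p > t for p in scores_pos) / len(scores_pos)
            best = max(best, sensitivity)
    return best


# Hypothetical toy scores, not data from the paper.
pos = [0.9, 0.8, 0.4]
neg = [0.1, 0.2, 0.3, 0.5]
print(auc(pos, neg))
print(sensitivity_at_specificity(pos, neg, min_specificity=0.99))
```

Fixing specificity first and then reading off sensitivity makes classifiers comparable at the same false-positive rate, which is presumably why the abstract quotes sensitivity at 99% specificity alongside AUC.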
Keywords
biochemistry; biology computing; cellular biophysics; decision theory; enzymes; learning (artificial intelligence); molecular biophysics; substrates; support vector machines; AUC; BDT; MAPK data; RF algorithm high specificity; RF algorithm sensitivity; SVM; bayesian decision theory; biological cellular regulatory process; experimental identification; kinase dataset; kinase family dataset; machine learning method; protein activity control; protein kinase-specific phosphorylation site prediction; protein phosphorylation site prediction; random forest algorithm; reversible phosphorylation; substrate kinase-specific phosphorylation site; support vector machine; bioinformatics; phosphorylation; random forest;
fLanguage
English
Publisher
IEEE
Conference_Titel
Biomedical Engineering and Informatics (BMEI), 2012 5th International Conference on
Conference_Location
Chongqing
Print_ISBN
978-1-4673-1183-0
Type
conf
DOI
10.1109/BMEI.2012.6513035
Filename
6513035