• DocumentCode
    2134447
  • Title
    Prediction of protein kinase-specific phosphorylation sites using Random forest algorithm
  • Author
    Wenwen Fan ; Liang Zou ; Ao Li ; Minghui Wang
  • Author_Institution
    Sch. of Inf. Sci. & Technol., Univ. of Sci. & Technol. of China, Hefei, China
  • fYear
    2012
  • fDate
    16-18 Oct. 2012
  • Firstpage
    986
  • Lastpage
    989
  • Abstract
    Reversible phosphorylation is an important mechanism for controlling protein activity in cellular regulatory processes. Experimental identification of kinase-specific phosphorylation sites in substrates is costly in both time and labor, so rapid and effective machine learning methods for prediction are desirable. In this paper, we adopt the random forest (RF) algorithm to predict phosphorylation sites. In a comparison with Bayesian Decision Theory (BDT) and Support Vector Machine (SVM) on four kinase/kinase family datasets, RF performed consistently better. For example, on MAPK data the RF algorithm achieved an AUC of 0.97, which was 0.04 and 0.03 higher than those of BDT and SVM, respectively. In addition, at a high specificity of 99%, the sensitivity of the RF algorithm reached 66%, which was 25% and 23% higher than those of BDT and SVM, respectively. These results show that RF is a powerful machine learning algorithm for protein phosphorylation site prediction.
  • Keywords
    biochemistry; biology computing; cellular biophysics; decision theory; enzymes; learning (artificial intelligence); molecular biophysics; substrates; support vector machines; AUC; BDT; MAPK data; RF algorithm high specificity; RF algorithm sensitivity; SVM; bayesian decision theory; biological cellular regulatory process; experimental identification; kinase dataset; kinase family dataset; machine learning method; protein activity control; protein kinase-specific phosphorylation site prediction; protein phosphorylation site prediction; random forest algorithm; reversible phosphorylation; substrate kinase-specific phosphorylation site; support vector machine; bioinformatics; phosphorylation; random forest;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Biomedical Engineering and Informatics (BMEI), 2012 5th International Conference on
  • Conference_Location
    Chongqing
  • Print_ISBN
    978-1-4673-1183-0
  • Type
    conf
  • DOI
    10.1109/BMEI.2012.6513035
  • Filename
    6513035
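
The abstract reports two evaluation measures: AUC, and sensitivity at a fixed 99% specificity. As a rough illustration of what those measures mean, the sketch below computes both from classifier scores in plain Python. It is not the authors' code: the function names `auc` and `sensitivity_at_specificity` and the toy labels and scores are hypothetical, chosen only for this example, and the numbers have no relation to the results in the paper.

```python
from typing import Sequence


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve, computed as the fraction of
    (positive, negative) pairs in which the positive example gets the
    higher score; ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def sensitivity_at_specificity(
    labels: Sequence[int], scores: Sequence[float], min_specificity: float
) -> float:
    """Best sensitivity over all score thresholds whose specificity is at
    least min_specificity. A site is called positive when score >= threshold."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    best = 0.0
    # float("inf") is the threshold that rejects everything (specificity 1.0).
    for threshold in sorted(set(scores)) + [float("inf")]:
        specificity = sum(1 for n in neg if n < threshold) / len(neg)
        if specificity >= min_specificity:
            sensitivity = sum(1 for p in pos if p >= threshold) / len(pos)
            best = max(best, sensitivity)
    return best


# Made-up example: 1 = phosphorylation site, 0 = non-site.
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.3, 0.6, 0.4, 0.2, 0.2, 0.1, 0.05]

print("AUC:", auc(toy_labels, toy_scores))
print("Sensitivity at 99% specificity:",
      sensitivity_at_specificity(toy_labels, toy_scores, 0.99))
```

Fixing specificity first and then reading off sensitivity, as the abstract does, lets classifiers be compared at the same false-positive rate, which is what matters when negatives greatly outnumber true sites.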