Title :
Improved method for predicting RNA-binding residues using random forest from primary sequence
Author :
Ma, Xin ; Yang, Yang
Author_Institution :
Golden Audit College, Nanjing Audit University, China
Abstract :
Protein-RNA interactions play an important role in a variety of biological processes in cells. An improved method is proposed for predicting RNA-binding residues from amino acid sequences, combining a novel hybrid feature with a random forest (RF) algorithm. The hybrid feature comprises evolutionary information, secondary structure information, and two novel features that reflect the correlation of amino acids in protein sequences with regard to hydrophobicity and polarity-charge, respectively. The prediction classifier achieves a Matthews correlation coefficient (MCC) of 0.5042 and an overall accuracy (ACC) of 85.17%, with 52.40% sensitivity (SE) and 92.89% specificity (SP). Further analysis shows that the two novel features and the evolutionary information contribute most to the prediction improvement. Comparisons with previous work show that our prediction model performs significantly better at predicting RNA-binding residues in proteins.
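The four scores reported in the abstract (MCC, ACC, SE, SP) are standard measures for a binary classifier and can all be derived from confusion-matrix counts. The following is a minimal sketch of those standard definitions, not the authors' evaluation code; the function name `binary_metrics` and the example counts are hypothetical and are not taken from the paper.

```python
import math


def binary_metrics(tp, tn, fp, fn):
    """Return ACC, SE, SP and MCC computed from confusion-matrix counts.

    tp/fn: binding residues predicted as binding / non-binding.
    tn/fp: non-binding residues predicted as non-binding / binding.
    """
    total = tp + tn + fp + fn
    acc = (tp + tn) / total                      # overall accuracy
    se = tp / (tp + fn) if (tp + fn) else 0.0    # sensitivity (recall on positives)
    sp = tn / (tn + fp) if (tn + fp) else 0.0    # specificity (recall on negatives)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is conventionally set to 0 when any marginal count is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"ACC": acc, "SE": se, "SP": sp, "MCC": mcc}


# Hypothetical counts for illustration only (not results from the paper).
example = binary_metrics(tp=50, tn=900, fp=70, fn=50)
```

Because RNA-binding residues are typically a minority class, ACC can stay high even when SE is modest, which is one reason MCC is usually reported alongside it.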
Keywords :
Accuracy; Amino acids; Classification algorithms; Correlation; Proteins; Radio frequency; Support vector machines; RNA-binding residues; dependency of amino acids; position specific scoring matrices (PSSMs); random forest (RF);
Conference_Title :
Information Science and Engineering (ICISE), 2010 2nd International Conference on
Conference_Location :
Hangzhou, China
Print_ISBN :
978-1-4244-7616-9
DOI :
10.1109/ICISE.2010.5690816