Author/Authors :
Wang, Yaoxin Zhejiang Sci-Tech University - Hangzhou, China , Xu, Yingjie Qixin School - Zhejiang Sci-Tech University - Hangzhou, China , Yang, Zhenyu Zhejiang Sci-Tech University - Hangzhou, China , Liu, Xiaoqing Hangzhou Dianzi University - Hangzhou, China , Dai, Qi Zhejiang Sci-Tech University - Hangzhou, China
Abstract :
Many combinations of protein features are used to improve protein structural class prediction, but the redundancy of information among them is
often ignored. To select the important features with strong classification ability, we propose a recursive feature selection
method based on random forest to improve protein structural class prediction. We evaluated the proposed method with four experiments and
compared it with the available competing prediction methods. The results indicate that the proposed feature selection method
effectively improves the efficiency of protein structural class prediction. Fewer than 5% of the features are used, yet the prediction
accuracy improves by 4.6-13.3%. We further compared different protein features and found that the predicted secondary
structural features achieve the best performance. This understanding can be used to design more powerful prediction methods
for the protein structural class.
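
The general idea of recursive feature selection with a random forest can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes scikit-learn's RFE wrapper around a RandomForestClassifier, and it uses synthetic data in place of real protein features. The helper name select_features, the data set sizes, the number of trees, and the elimination step are all hypothetical choices made for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE
from sklearn.model_selection import cross_val_score


def select_features(X, y, n_keep, step=8, seed=0):
    """Return the column indices kept by recursive feature elimination.

    Hypothetical helper: each round fits a random forest, ranks features by
    the forest's importances, and drops the `step` lowest-ranked ones until
    only `n_keep` remain.
    """
    forest = RandomForestClassifier(n_estimators=50, random_state=seed)
    selector = RFE(estimator=forest, n_features_to_select=n_keep, step=step)
    selector.fit(X, y)
    return selector.get_support(indices=True)


# Synthetic stand-in for a protein feature matrix (assumption): 200 samples,
# 100 features of which many are redundant, and 4 classes.
X, y = make_classification(
    n_samples=200,
    n_features=100,
    n_informative=6,
    n_redundant=20,
    n_classes=4,
    n_clusters_per_class=1,
    random_state=0,
)

# Keep 4 of 100 features, i.e. fewer than 5% of the original set.
kept = select_features(X, y, n_keep=4)
print("kept feature indices:", kept)

# Compare cross-validated accuracy on all features and on the kept subset.
# The subset was chosen using the full data set, so its score is optimistic;
# a careful evaluation would repeat the selection inside each fold.
clf = RandomForestClassifier(n_estimators=50, random_state=0)
acc_all = cross_val_score(clf, X, y, cv=5).mean()
acc_kept = cross_val_score(clf, X[:, kept], y, cv=5).mean()
print(f"accuracy with all features:  {acc_all:.3f}")
print(f"accuracy with kept features: {acc_kept:.3f}")
```

Removing several features per round keeps the number of forest fits small; a step of 1 gives a finer ranking at a higher computational cost.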