Title of article :
iBLP: An XGBoost-Based Predictor for Identifying Bioluminescent Proteins
Author/Authors :
Zhang, Dan School of Life Science and Technology and Center for Informational Biology - University of Electronic Science and Technology of China - Chengdu, China , Chen, Hua-Dong School of Basic Medical Sciences - Fujian Medical University - Fuzhou, China , Zulfiqar, Hasan School of Life Science and Technology and Center for Informational Biology - University of Electronic Science and Technology of China - Chengdu, China , Yuan, Shi-Shi School of Life Science and Technology and Center for Informational Biology - University of Electronic Science and Technology of China - Chengdu, China , Huang, Qin-Lai School of Life Science and Technology and Center for Informational Biology - University of Electronic Science and Technology of China - Chengdu, China , Zhang, Zhao-Yue School of Life Science and Technology and Center for Informational Biology - University of Electronic Science and Technology of China - Chengdu, China , Deng, Ke-Jun School of Life Science and Technology and Center for Informational Biology - University of Electronic Science and Technology of China - Chengdu, China
Abstract :
Bioluminescent proteins (BLPs) are a class of proteins that are widely distributed in many living organisms with various mechanisms of
light emission including bioluminescence and chemiluminescence from luminous organisms. Bioluminescence has been
commonly used in various analytical research methods of cellular processes, such as gene expression analysis, drug discovery,
cellular imaging, and toxicity determination. However, the identification of bioluminescent proteins is challenging because they share
little sequence similarity. In this paper, we briefly reviewed the development of the computational identification
of BLPs and subsequently proposed a novel prediction framework for identifying BLPs based on the eXtreme gradient boosting
(XGBoost) algorithm and sequence-derived features. To train the models, we collected BLP data from bacteria, eukaryotes,
and archaea. Then, to obtain more effective prediction models, we examined the performance of different feature extraction
methods and their combinations as well as classification algorithms. Finally, based on the optimal model, a novel predictor
named iBLP was constructed to identify BLPs. The robustness of iBLP was demonstrated by experiments on training and
independent datasets. Comparison with other published methods further demonstrated that the proposed method is powerful
and could provide good performance for BLP identification. The webserver and software package for BLP identification are
freely available at http://lin-group.cn/server/iBLP.
Keywords :
XGBoost , Bioluminescent , iBLP , BLP
Journal title :
Computational and Mathematical Methods in Medicine