DocumentCode
741050
Title
Enhanced Protein Fold Prediction Method Through a Novel Feature Extraction Technique
Author
Leyi Wei ; Minghong Liao ; Xing Gao ; Quan Zou
Author_Institution
Sch. of Software, Xiamen Univ., Xiamen, China
Volume
14
Issue
6
fYear
2015
Firstpage
649
Lastpage
659
Abstract
Information on protein 3-dimensional (3D) structures plays an essential role in molecular biology, cell biology, biomedicine, and drug design. Protein fold prediction is considered an immediate step toward deciphering protein 3D structures, and is therefore one of the fundamental problems in structural bioinformatics. Recently, numerous taxonomic methods have been developed for protein fold prediction. Although much progress has been made, the overall prediction accuracies achieved by existing taxonomic methods remain unsatisfactory. To address this problem, we propose a novel taxonomic method, called PFPA, which combines a novel feature set with an ensemble classifier. In particular, the sequential evolution information from PSI-BLAST profiles and the local and global secondary structure information from PSI-PRED profiles are combined to construct a comprehensive feature set. Experimental results demonstrate that PFPA outperforms the state-of-the-art predictors. Specifically, when tested on the independent testing set of a benchmark dataset, PFPA achieves an overall accuracy of 73.6%, the highest accuracy reported to date. Moreover, PFPA performs well, without significant performance degradation, on three updated large-scale datasets, indicating its robustness and generalization ability. Currently, a webserver that implements PFPA is freely available at http://121.192.180.204:8080/PFPA/Index.html.
Keywords
benchmark testing; bioinformatics; feature extraction; molecular biophysics; molecular configurations; pattern classification; proteins; PFPA; PSI-BLAST; benchmark dataset; biomedicine; cell biology; drug design; enhanced protein fold prediction method; ensemble classifier; feature extraction technique; feature set; global secondary structure information; independent testing set; local secondary structure information; molecular biology; protein 3-dimensional structures; sequential evolution information; state-of-the-art predictors; structural bioinformatics; taxonomic methods; updated large-scale datasets; Accuracy; Amino acids; Feature extraction; Protein engineering; Proteins; Testing; Three-dimensional displays; Ensemble classifier; feature extraction; protein fold prediction
fLanguage
English
Journal_Title
NanoBioscience, IEEE Transactions on
Publisher
IEEE
ISSN
1536-1241
Type
jour
DOI
10.1109/TNB.2015.2450233
Filename
7229420
Link To Document