DocumentCode :
1307143
Title :
Identifying Bacterial Virulent Proteins by Fusing a Set of Classifiers Based on Variants of Chou's Pseudo Amino Acid Composition and on Evolutionary Information
Author :
Nanni, L. ; Lumini, A. ; Gupta, D. ; Garg, A.
Author_Institution :
Dept. of Inf. Eng., Univ. of Padua, Padova, Italy
Volume :
9
Issue :
2
fYear :
2012
Firstpage :
467
Lastpage :
475
Abstract :
The availability of a reliable method for predicting bacterial virulent proteins has several important applications in research efforts aimed at finding novel drug targets and vaccine candidates and at understanding virulence mechanisms in pathogens. In this work, we study several feature extraction approaches for representing proteins and propose a novel bacterial virulent protein prediction method based on an ensemble of classifiers, where the features are extracted directly from the amino acid sequence and from the evolutionary information of a given protein. We evaluate and compare several ensembles obtained by combining six feature extraction methods with several classification approaches based on two general-purpose classifiers (i.e., Support Vector Machine and a variant of input decimated ensemble) and their random subspace versions. An extensive evaluation was performed according to a blind testing protocol, in which the parameters of the system are optimized using the training set and the system is validated on three independent data sets, allowing selection of the best-performing system and demonstrating the validity of the proposed method. The results obtained using the blind test protocol show that, although the best-performing stand-alone method is not the same in every independent data set, the fusion of different methods enhances prediction efficiency in all the tested independent data sets.
Keywords :
bioinformatics; drugs; feature extraction; learning (artificial intelligence); microorganisms; proteins; support vector machines; Chou pseudoamino acid composition; blind testing protocol; bacterial virulent protein prediction method; classifiers based variants; drug targets; evolutionary information; feature extraction methods; fusion; general purpose classifiers; independent data set; input decimated ensemble; pathogens; random subspace version; stand-alone method; support vector machine; vaccine candidates; Amino acids; Bioinformatics; Computational biology; Encoding; Feature extraction; Microorganisms; Proteins; Virulent proteins; ensemble of classifiers; machine learning; support vector machines; Adhesins, Bacterial; Amino Acids; Bacterial Proteins; Computational Biology; Databases, Protein; Evolution, Molecular; Fimbriae Proteins; Support Vector Machines;
fLanguage :
English
Journal_Title :
Computational Biology and Bioinformatics, IEEE/ACM Transactions on
Publisher :
IEEE
ISSN :
1545-5963
Type :
jour
DOI :
10.1109/TCBB.2011.117
Filename :
5999656