DocumentCode :
580058
Title :
GNeg-CEF: Gram-negative bacterial protein prediction using ensemble approach for individual feature extraction strategies
Author :
Zahur, Awais ; Majid, Abdul
Author_Institution :
Dept. of Comput. & Inf. Sci., Pakistan Inst. of Eng. & Appl. Sci., Islamabad, Pakistan
fYear :
2012
fDate :
8-9 Oct. 2012
Firstpage :
1
Lastpage :
5
Abstract :
The importance of automatically annotating the subcellular attributes of uncharacterized proteins, and of using those annotations promptly in drug discovery, is self-evident. Accurate information about where a protein is located in a cell facilitates understanding of the protein's function and of its interactions in the cellular environment. We propose a novel approach, GNeg-CEF, for predicting the subcellular locations of gram-negative bacterial proteins. The proposed scheme exploits diversity in both the feature space and the decision space. To exploit diversity in the feature space, we use six feature extraction strategies: Amino Acid Composition (AAC), Split Amino Acid Composition (SAAC), parallel Pseudo Amino Acid Composition (PseAAC), series PseAAC, Dipeptide Composition (DC), and Sequential Evolution (PseEvo). Diversity in the decision space is exploited using three state-of-the-art classification models: Support Vector Machine, k-Nearest Neighbor, and Back Propagation Neural Network. First, the performance of the individual ensemble classifier built on each single feature extraction technique is evaluated. Next, the improved performance of the composite ensemble GNeg-CEF, which combines all individual ensembles by majority voting, is investigated on a gram-negative bacterial protein dataset.
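The abstract names the building blocks but not their implementation. The following is a minimal sketch, not the authors' code, of two of the listed feature extractors (AAC and DC) and of majority voting. It assumes the composite decision is a two-level vote: first across the classifiers that share a feature set, then across the resulting per-feature ensembles. The function names and the location labels in the usage note are hypothetical.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def amino_acid_composition(sequence):
    """Return the 20-dimensional AAC vector: the fraction of each residue."""
    counts = Counter(sequence)
    total = sum(counts[a] for a in AMINO_ACIDS)  # ignore non-standard symbols
    if total == 0:
        raise ValueError("sequence contains no standard residues")
    return [counts[a] / total for a in AMINO_ACIDS]


def dipeptide_composition(sequence):
    """Return the 400-dimensional DC vector: the fraction of each adjacent pair."""
    pairs = [sequence[i:i + 2] for i in range(len(sequence) - 1)]
    valid = [p for p in pairs if p[0] in AMINO_ACIDS and p[1] in AMINO_ACIDS]
    if not valid:
        raise ValueError("sequence contains no standard dipeptides")
    counts = Counter(valid)
    return [counts[a + b] / len(valid) for a in AMINO_ACIDS for b in AMINO_ACIDS]


def majority_vote(labels):
    """Return the most frequent label; ties go to the label seen first."""
    return Counter(labels).most_common(1)[0][0]


def composite_vote(predictions):
    """Two-level vote (an assumption, see the note above).

    predictions maps a feature-set name to the labels predicted by the
    classifiers trained on that feature set.
    """
    ensemble_labels = [majority_vote(labels) for labels in predictions.values()]
    return majority_vote(ensemble_labels)
```

For example, `composite_vote({"AAC": ["inner", "inner", "outer"], "DC": ["outer", "outer", "inner"], "SAAC": ["inner", "outer", "inner"]})` first reduces each feature set to one label and then votes across those labels. In practice the per-classifier labels would come from trained SVM, k-NN, and neural network models, which are omitted here.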
Keywords :
backpropagation; bioinformatics; cellular biophysics; feature extraction; microorganisms; neural nets; pattern classification; proteins; support vector machines; GNeg-CEF; back propagation neural network; cell protein location; cellular environment; classification model; decision space; dipeptide composition; drug discovery; ensemble approach; ensemble classifier; feature extraction strategy; feature space; gram-negative bacterial protein prediction; k-nearest neighbor; majority voting scheme; predicting gram-negative bacterial subcellular location; protein function; pseudo amino acid composition; sequential evolution; split amino acid composition; subcellular attribute; support vector machine; uncharacterized protein; Amino acids; Databases; Feature extraction; Microorganisms; Predictive models; Proteins; Support vector machines; amino acid; bacteria; ensemble classifiers; subcellular location;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Emerging Technologies (ICET), 2012 International Conference on
Conference_Location :
Islamabad
Print_ISBN :
978-1-4673-4452-4
Type :
conf
DOI :
10.1109/ICET.2012.6375475
Filename :
6375475