Title :
Predict Gram-Positive and Gram-Negative Subcellular Localization via Incorporating Evolutionary Information and Physicochemical Features Into Chou's General PseAAC
Author :
Sharma, Ronesh ; Dehzangi, Abdollah ; Lyons, James ; Paliwal, Kuldip ; Tsunoda, Tatsuhiko ; Sharma, Alok
Author_Institution :
Sch. of Electr. & Electron. Eng., Fiji Nat. Univ., Suva, Fiji
Abstract :
In this study, we used structural and evolutionary-based features to represent protein sequences for gram-positive and gram-negative subcellular localization prediction. To do this, we proposed a normalization method that constructs a normalized Position Specific Scoring Matrix (PSSM) from the information in the original PSSM. To investigate the effectiveness of the proposed method, we computed feature vectors from the normalized PSSM, applied a support vector machine (SVM) and a naïve Bayes classifier, respectively, and compared the achieved results with previously reported results. We also computed features from both the original PSSM and the normalized PSSM and compared their results. The achieved results show an enhancement in gram-positive and gram-negative subcellular localization prediction. Evaluating localization for each feature, our results indicate that employing SVM with concatenated features (amino acid composition feature, Dubchak feature (physicochemical-based features), normalized PSSM based auto-covariance feature, and normalized PSSM based bigram feature) yields higher accuracy, while employing the naïve Bayes classifier with the normalized PSSM based auto-covariance feature proves to have high sensitivity for both benchmarks. Our reported overall locative accuracy is 84.8% and overall absolute accuracy is 85.16% for the gram-positive dataset; for the gram-negative dataset, overall locative accuracy is 85.4% and overall absolute accuracy is 86.3%.
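The PSSM-based features named in the abstract can be illustrated with a minimal sketch. The abstract does not state the normalization formula, so the logistic squashing below is only an assumption, not the authors' method. The bigram and auto-covariance functions follow the commonly used definitions of those features; all function names and the toy matrix are hypothetical.

```python
import math


def normalize_pssm(pssm):
    """Map each raw PSSM score into (0, 1) with a logistic function.

    ASSUMPTION: the abstract does not give the normalization formula;
    the logistic function is used here purely for illustration.
    """
    return [[1.0 / (1.0 + math.exp(-score)) for score in row] for row in pssm]


def bigram_features(pssm):
    """Flattened bigram matrix B, where B[m][n] = sum_i P[i][m] * P[i+1][n]."""
    length, width = len(pssm), len(pssm[0])
    feats = []
    for m in range(width):
        for n in range(width):
            feats.append(
                sum(pssm[i][m] * pssm[i + 1][n] for i in range(length - 1))
            )
    return feats


def auto_covariance_features(pssm, max_lag):
    """Per-column auto-covariance for lags 1..max_lag, flattened column by column."""
    length, width = len(pssm), len(pssm[0])
    if max_lag >= length:
        raise ValueError("max_lag must be smaller than the sequence length")
    means = [sum(row[j] for row in pssm) / length for j in range(width)]
    feats = []
    for j in range(width):
        for lag in range(1, max_lag + 1):
            total = sum(
                (pssm[i][j] - means[j]) * (pssm[i + lag][j] - means[j])
                for i in range(length - lag)
            )
            feats.append(total / (length - lag))
    return feats


# Toy 3-residue, 2-column matrix (a real PSSM has 20 columns, one per amino acid).
toy = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
bigram = bigram_features(toy)
autocov = auto_covariance_features(toy, max_lag=1)
```

For a real 20-column PSSM, the bigram vector has 400 entries and the auto-covariance vector has 20 times `max_lag` entries. Such vectors could then be concatenated with the other features before being passed to a classifier.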
Keywords :
Bayes methods; biochemistry; biology computing; cellular biophysics; evolutionary computation; feature extraction; molecular biophysics; molecular configurations; pattern classification; support vector machines; Chou's general PseAAC; Dubchak feature; SVM; amino acid composition feature; computed features; evolutionary based features; evolutionary information; gram-negative dataset; gram-negative subcellular localization sequences; gram-positive dataset; gram-positive subcellular localization sequences; naive Bayes classifier; normalized PSSM based autocovariance feature; normalized position specific scoring matrix; physicochemical-based features; structural based features; support vector machine; Accuracy; Amino acids; Benchmark testing; Feature extraction; Microorganisms; Proteins; Support vector machines; Evolutionary-based features; normalized PSSM
Journal_Title :
NanoBioscience, IEEE Transactions on
DOI :
10.1109/TNB.2015.2500186