DocumentCode
3610337
Title
Predict Gram-Positive and Gram-Negative Subcellular Localization via Incorporating Evolutionary Information and Physicochemical Features Into Chou's General PseAAC
Author
Sharma, Ronesh ; Dehzangi, Abdollah ; Lyons, James ; Paliwal, Kuldip ; Tsunoda, Tatsuhiko ; Sharma, Alok
Author_Institution
Sch. of Electr. & Electron. Eng., Fiji Nat. Univ., Suva, Fiji
Volume
14
Issue
8
fYear
2015
Firstpage
915
Lastpage
926
Abstract
In this study, we use structural and evolutionary-based features to represent sequences for gram-positive and gram-negative subcellular localization. To this end, we propose a normalization method that constructs a normalized Position Specific Scoring Matrix (PSSM) from the information in the original PSSM. To investigate the effectiveness of the proposed method, we compute feature vectors from the normalized PSSM, apply a support vector machine (SVM) and a naïve Bayes classifier, and compare the achieved results with previously reported results. We also compute features from both the original PSSM and the normalized PSSM and compare their results. The achieved results show an improvement in gram-positive and gram-negative subcellular localization. Evaluating localization for each feature, our results indicate that an SVM with concatenated features (amino acid composition feature, Dubchak feature (physicochemical-based features), normalized PSSM based auto-covariance feature, and normalized PSSM based bigram feature) achieves higher accuracy, while a naïve Bayes classifier with the normalized PSSM based auto-covariance feature achieves high sensitivity on both benchmarks. For the gram-positive dataset, the overall locative accuracy is 84.8% and the overall absolute accuracy is 85.16%; for the gram-negative dataset, the overall locative accuracy is 85.4% and the overall absolute accuracy is 86.3%.
Keywords
Bayes methods; biochemistry; biology computing; cellular biophysics; evolutionary computation; feature extraction; molecular biophysics; molecular configurations; pattern classification; support vector machines; Chou's general PseAAC; Dubchak feature; SVM; amino acid composition feature; computed features; evolutionary based features; evolutionary information; gram-negative dataset; gram-negative subcellular localization sequences; gram-positive dataset; gram-positive subcellular localization sequences; naive Bayes classifier; normalized PSSM based autocovariance feature; normalized position specific scoring matrix; physicochemical-based features; structural based features; support vector machine; Accuracy; Amino acids; Benchmark testing; Feature extraction; Microorganisms; Proteins; Support vector machines; Evolutionary-based features; normalized PSSM
fLanguage
English
Journal_Title
NanoBioscience, IEEE Transactions on
Publisher
IEEE
ISSN
1536-1241
Type
jour
DOI
10.1109/TNB.2015.2500186
Filename
7328300