• DocumentCode
    1630710
  • Title
    Prediction of Protein Subcellular Localizations
  • Author
    Yu, Chin-Sheng; Hwang, Jenn-Kang
  • Author_Institution
    Dept. of Inf. Eng. & Comput. Sci., Feng Chia Univ., Taichung
  • Volume
    1
  • fYear
    2008
  • Firstpage
    165
  • Lastpage
    170
  • Abstract
    The support vector machine (SVM) method based on n-peptide composition (Yu et al., Proteins: Struct. Funct. Genet. 2003; 50:531-536) is used to predict the subcellular localizations of proteins. For an unbiased assessment of the results, we apply our approach to two independent data sets. The first (Reinhardt and Hubbard, Nucleic Acids Res. 1998; 26:2230-2236) consists of two parts: a prokaryotic set of 997 protein sequences in three categories and a eukaryotic set of 2427 sequences in four localization categories. The second comprises 2191 proteins in 12 subcellular localizations (Chou and Cai, J. Biol. Chem. 2002; 277:45765-45769). Our approach provides excellent results for both data sets. For the first data set, it gives an overall prediction accuracy of 93.2% for prokaryotic sequences and 88.1% for eukaryotic sequences, and it yields a significantly better Matthews correlation coefficient for each subcellular localization than the existing approaches. For the second data set, it achieves an overall prediction accuracy of 83.2%, which is around 10% higher than the best existing result. Our approach should be valuable in the high-throughput analysis of genomics and proteomics.
  • Keywords
    biology computing; molecular biophysics; proteins; support vector machines; Matthews correlation coefficient; eukaryotic set; genomics; n-peptide composition; protein sequences; protein subcellular localization prediction; proteomics; support vector machine; Accuracy; Amino acids; Bioinformatics; Intelligent systems; Machine intelligence; Organisms; Protein engineering; Sequences; subcellular localization prediction
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Intelligent Systems Design and Applications, 2008. ISDA '08. Eighth International Conference on
  • Conference_Location
    Kaohsiung
  • Print_ISBN
    978-0-7695-3382-7
  • Type
    conf
  • DOI
    10.1109/ISDA.2008.306
  • Filename
    4696197
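
The abstract names two quantities that are easy to misread without a concrete form: the n-peptide composition used as SVM input, and the per-localization Matthews correlation coefficient used for evaluation. The sketch below illustrates both in plain Python. It is an assumption-laden illustration, not the authors' implementation: the record does not specify the exact encoding, so the sliding-window counting, the normalization by window count, and the function names `n_peptide_composition` and `matthews_cc` are all hypothetical choices made here.

```python
from collections import Counter
from itertools import product
from math import sqrt

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def n_peptide_composition(sequence, n=2):
    """Return normalized frequencies of all length-n peptides in a fixed order.

    Assumption: overlapping windows are counted, and windows containing a
    non-standard residue (e.g. 'X') are skipped. The vector has 20**n entries.
    """
    valid = set(AMINO_ACIDS)
    windows = [
        sequence[i:i + n]
        for i in range(len(sequence) - n + 1)
        if set(sequence[i:i + n]) <= valid
    ]
    counts = Counter(windows)
    total = len(windows)
    keys = ("".join(p) for p in product(AMINO_ACIDS, repeat=n))
    return [counts[k] / total if total else 0.0 for k in keys]


def matthews_cc(true_labels, predicted_labels, category):
    """Matthews correlation coefficient for one category, treated one-vs-rest."""
    tp = fp = tn = fn = 0
    for t, p in zip(true_labels, predicted_labels):
        is_true, is_pred = (t == category), (p == category)
        if is_true and is_pred:
            tp += 1
        elif is_pred:
            fp += 1
        elif is_true:
            fn += 1
        else:
            tn += 1
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Convention assumed here: return 0.0 when the coefficient is undefined.
    return (tp * tn - fp * fn) / denom if denom else 0.0


if __name__ == "__main__":
    features = n_peptide_composition("MKVLAAGIVGLLLA", n=2)
    print(len(features))  # 400 dipeptide frequencies
    print(matthews_cc(["cyt", "cyt", "nuc"], ["cyt", "nuc", "nuc"], "cyt"))
```

Feature vectors of this kind (for one or several values of n) could then be passed to any SVM implementation; the kernel and parameters used in the paper are not stated in this record.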