• DocumentCode
    2796555
  • Title

    Prediction of protein subcellular localization with a novel method: Sequence-segmented PSEAAC

  • Author

    Zhang, Shao-Wu ; Yang, Hui-Fang ; Li, Qi-Peng ; Cheng, Yong-mei ; Pan, Quan

  • Author_Institution
    Coll. of Autom., Northwestern Polytech. Univ., Xi'an
  • Volume
    7
  • fYear
    2008
  • fDate
    12-15 July 2008
  • Firstpage
    4024
  • Lastpage
    4028
  • Abstract
    Information on the subcellular localization of proteins is important because it provides useful insights into their functions, as well as how and in what kind of cellular environments they interact with each other and with other molecules. Facing the explosion of newly generated protein sequences in the post-genomic era, we are challenged to develop an automated method for fast and reliable annotation of their subcellular localizations. To tackle this challenge, a novel method, the sequence-segmented pseudo amino acid composition (PseAAC), is introduced to represent protein samples. Based on the concept of Chou's PseAAC, a series of useful information and techniques, such as multi-scale energy and moment descriptors, were utilized to generate the sequence-segmented pseudo amino acid components for representing the protein samples. Meanwhile, multi-class SVM classifier modules were adopted for predicting 16 kinds of eukaryotic protein subcellular localizations. Compared with existing methods, this new approach provides better predictive performance. The overall success rates obtained in the jackknife test and the independent dataset test suggest that the sequence-segmented PseAAC method is quite promising and might also hold great potential as a useful vehicle for other areas of molecular biology.
  • Keywords
    biology computing; macromolecules; molecular biophysics; proteins; support vector machines; molecular biology; moment descriptors; multiscale energy; protein subcellular localization; pseudo amino acid composition; sequence-segmented PSEAAC; Amino acids; Bioinformatics; Explosions; Genomics; Protein engineering; Sequences; Support vector machine classification; Support vector machines; Testing; Vehicles; moment descriptor; multi-scale energy; sequence-segmented PseAAC; support vector machine;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Machine Learning and Cybernetics, 2008 International Conference on
  • Conference_Location
    Kunming
  • Print_ISBN
    978-1-4244-2095-7
  • Electronic_ISBN
    978-1-4244-2096-4
  • Type

    conf

  • DOI
    10.1109/ICMLC.2008.4621106
  • Filename
    4621106