• DocumentCode
    3116956
  • Title
    Eukaryotic Protein Subcellular Localization Based on Local Pairwise Profile Alignment SVM
  • Author
    Guo, Jian ; Mak, Man-Wai ; Kung, Sun-Yuan
  • Author_Institution
    Dept. of Electron. & Inf. Eng., Hong Kong Polytech. Univ., Hong Kong
  • fYear
    2006
  • fDate
    6-8 Sept. 2006
  • Firstpage
    391
  • Lastpage
    396
  • Abstract
    This paper studies the use of profile alignment and support vector machines for subcellular localization. In the training phase, the profiles of all protein sequences in the training set are constructed by PSI-BLAST, and the pairwise profile-alignment scores are used to form feature vectors for training a support vector machine (SVM) classifier. During testing, the profile of a query protein sequence is computed and aligned with all the profiles constructed during training to obtain a feature vector for classification by the SVM classifier. Tests on Reinhardt and Hubbard's eukaryotic protein dataset show that the total accuracy can reach 99.4%, which is significantly higher than the accuracies obtained by methods based on sequence alignments and amino acid composition. It was also found that the proposed method can still achieve a prediction accuracy of 96% even if none of the sequence pairs in the dataset contains more than 5% identity. This paper also demonstrates that the performance of the SVM is proportional to the degree to which its kernel matrix satisfies Mercer's condition.
  • Keywords
    biology computing; cellular biophysics; feature extraction; pattern classification; proteins; support vector machines; Mercer condition; PSI-BLAST; amino acid composition; eukaryotic protein subcellular localization; feature vectors; kernel matrix; local pairwise profile alignment; protein sequence; sequence alignment; support vector machine classifier; Accuracy; Amino acids; Frequency; Hidden Markov models; Kernel; Protein engineering; Protein sequence; Support vector machine classification; Support vector machines; Testing;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Machine Learning for Signal Processing, 2006. Proceedings of the 2006 16th IEEE Signal Processing Society Workshop on
  • Conference_Location
    Arlington, VA
  • ISSN
    1551-2541
  • Print_ISBN
    1-4244-0656-0
  • Electronic_ISBN
    1551-2541
  • Type
    conf
  • DOI
    10.1109/MLSP.2006.275581
  • Filename
    4053680