• DocumentCode
    3408693
  • Title

    Prediction of O-linked Glycosylation Sites in Protein Sequence by PCA-LDA

  • Author

    Yang, Xue-Mei

  • Author_Institution
    Coll. of Math. & Inf. Sci., Xianyang Normal Univ., Xianyang, China
  • Volume
    1
  • fYear
    2009
  • fDate
    12-14 Aug. 2009
  • Firstpage
    158
  • Lastpage
    161
  • Abstract
    O-glycosylation is one of the main types of mammalian protein glycosylation; it occurs at particular serine and threonine sites. In this paper, a new PCA-LDA method is used to predict O-glycosylation sites under a range of window sizes (5, 7, 9, 11, 21, 31, 41, 51). The method combines PCA and LDA, and we also call it hybrid discriminant analysis (HDA). The test protein sequence, encoded by sparse coding, is projected onto a one-dimensional subspace. The Mahalanobis distance between the projection and each class center is then calculated, and the test sequence is assigned to the "nearest" class, which indicates whether a particular serine or threonine site is glycosylated. The experimental results show that the proposed HDA method is more effective and accurate. The prediction accuracy ranges from about 75% to 92.5%.
  • Keywords
    molecular configurations; principal component analysis; proteins; proteomics; HDA; Mahalanobis distance; O-linked glycosylation site prediction; PCA-LDA; hybrid discriminant analysis; mammalian protein glycosylation; protein sequence; serine site; threonine site; Accuracy; Amino acids; Educational institutions; Hybrid intelligent systems; Information science; Mathematics; Principal component analysis; Protein sequence; Support vector machines; Testing; HDA; classification; glycosylation; prediction; protein; sparse coding;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Hybrid Intelligent Systems, 2009. HIS '09. Ninth International Conference on
  • Conference_Location
    Shenyang
  • Print_ISBN
    978-0-7695-3745-0
  • Type

    conf

  • DOI
    10.1109/HIS.2009.39
  • Filename
    5254306
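
The pipeline outlined in the abstract (sparse coding of a sequence window, PCA followed by LDA, projection onto one dimension, assignment to the class with the nearest center by Mahalanobis distance) can be illustrated with a minimal sketch. This is not the author's implementation. The sketch assumes NumPy, and the function names, the padding symbol `-`, the ridge term, the pooled within-class variance used for the one-dimensional Mahalanobis distance, and the toy training windows are all assumptions made here for illustration.

```python
import numpy as np

# 20 standard amino acids plus '-' as a padding symbol (an assumption of this sketch).
ALPHABET = "ACDEFGHIKLMNPQRSTVWY-"

def extract_windows(sequence, size):
    """Return (index, window) for every S/T residue, padding the sequence ends."""
    half = size // 2
    padded = "-" * half + sequence + "-" * half
    return [(i, padded[i:i + size]) for i, r in enumerate(sequence) if r in "ST"]

def sparse_encode(window):
    """Sparse (one-hot) coding: one 21-bit indicator vector per window position."""
    vec = np.zeros((len(window), len(ALPHABET)))
    for pos, residue in enumerate(window):
        vec[pos, ALPHABET.index(residue)] = 1.0
    return vec.ravel()

def scatter(z):
    """Scatter matrix of the rows of z around their mean."""
    centred = z - z.mean(axis=0)
    return centred.T @ centred

def fit_pca_lda(X, y, n_components):
    """PCA to n_components dimensions, then two-class Fisher LDA to one dimension."""
    mean = X.mean(axis=0)
    # PCA: principal directions are the right singular vectors of the centred data.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    pcs = vt[:n_components].T
    Z = (X - mean) @ pcs
    z0, z1 = Z[y == 0], Z[y == 1]
    # Within-class scatter with a small ridge so the solve stays well conditioned.
    sw = scatter(z0) + scatter(z1) + 1e-6 * np.eye(n_components)
    w = np.linalg.solve(sw, z1.mean(axis=0) - z0.mean(axis=0))
    w /= np.linalg.norm(w)
    proj = Z @ w
    centers = np.array([proj[y == 0].mean(), proj[y == 1].mean()])
    # Pooled within-class variance of the 1-D projections (assumed shared by both classes).
    pooled_var = 0.5 * (proj[y == 0].var() + proj[y == 1].var()) + 1e-12
    return {"mean": mean, "pcs": pcs, "w": w, "centers": centers, "var": pooled_var}

def predict(model, X):
    """Assign each row of X to the class whose center is nearest in Mahalanobis distance."""
    proj = ((X - model["mean"]) @ model["pcs"]) @ model["w"]
    dist = (proj[:, None] - model["centers"]) ** 2 / model["var"]
    return dist.argmin(axis=1)

# Toy windows of size 5, invented for illustration only (not real glycosylation data):
# label 1 = glycosylated, label 0 = not glycosylated.
train = ["APSPP", "APTPP", "LPSPP", "LPTPP", "AGSGG", "AGTGG", "LGSGG", "LGTGG"]
y = np.array([1, 1, 1, 1, 0, 0, 0, 0])
X = np.array([sparse_encode(w) for w in train])

model = fit_pca_lda(X, y, n_components=3)
preds = predict(model, np.array([sparse_encode("VPSPP"), sparse_encode("AGTGP")]))
```

With a pooled variance, the one-dimensional Mahalanobis rule reduces to choosing the nearer class center; using a separate variance per class would be an equally plausible reading of the abstract. On real data, `extract_windows` would supply the windows for each of the window sizes listed in the abstract.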