DocumentCode
3408693
Title
Prediction of O-linked Glycosylation Sites in Protein Sequence by PCA-LDA
Author
Yang, Xue-Mei
Author_Institution
Coll. of Math. & Inf. Sci., Xianyang Normal Univ., Xianyang, China
Volume
1
fYear
2009
fDate
12-14 Aug. 2009
Firstpage
158
Lastpage
161
Abstract
O-glycosylation is one of the main types of mammalian protein glycosylation; it occurs at particular serine and threonine sites. In this paper, a new PCA-LDA method is used to predict O-glycosylation sites under a range of window sizes (5, 7, 9, 11, 21, 31, 41, 51). The method combines PCA and LDA and is also called hybrid discriminant analysis (HDA). A test protein sequence, encoded by sparse coding, is projected onto a one-dimensional subspace; the Mahalanobis distance between the projection and each class center is then calculated, and the sequence is assigned to the "nearest" class, which indicates whether a particular serine or threonine site is glycosylated. Experimental results show that the proposed HDA method is more effective and accurate, with a prediction accuracy of about 75% to 92.5%.
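The pipeline summarized in the abstract (sparse encoding of a sequence window, projection onto a one-dimensional subspace, nearest class center by Mahalanobis distance) can be illustrated with a minimal Python sketch. This is not the paper's implementation. It assumes that "PCA-LDA" means PCA followed by a two-class Fisher LDA, that sparse coding is a 21-bit one-hot code per position (20 residues plus a hypothetical padding symbol `X`), and that each class has its own variance in the projected space. The function names, the ridge term, and `n_components` are illustrative choices.

```python
import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # 20 standard residues
PAD = "X"  # hypothetical padding symbol for windows running past a sequence end
ALPHABET = AMINO_ACIDS + PAD


def sparse_encode(window):
    """One-hot ("sparse") encode a residue window: 21 bits per position."""
    vec = np.zeros((len(window), len(ALPHABET)))
    for i, aa in enumerate(window):
        vec[i, ALPHABET.index(aa)] = 1.0
    return vec.ravel()


def fit_pca_lda(X, y, n_components):
    """Fit PCA, then a two-class Fisher LDA direction inside the PCA subspace.

    Assumption: labels are 0 (not glycosylated) and 1 (glycosylated).
    """
    mean = X.mean(axis=0)
    # PCA via SVD of the centered data; rows of vt are principal directions.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    P = vt[:n_components].T
    Z = (X - mean) @ P

    Z0, Z1 = Z[y == 0], Z[y == 1]
    m0, m1 = Z0.mean(axis=0), Z1.mean(axis=0)
    # Within-class scatter, with a small ridge for numerical stability.
    Sw = (Z0 - m0).T @ (Z0 - m0) + (Z1 - m1).T @ (Z1 - m1)
    Sw += 1e-6 * np.eye(Sw.shape[0])
    w = np.linalg.solve(Sw, m1 - m0)  # Fisher discriminant direction

    # Class centers and variances of the one-dimensional projections.
    t0, t1 = Z0 @ w, Z1 @ w
    centers = np.array([t0.mean(), t1.mean()])
    variances = np.array([t0.var(), t1.var()]) + 1e-12
    return {"mean": mean, "P": P, "w": w,
            "centers": centers, "variances": variances}


def predict(model, X):
    """Assign each row of X to the class with the nearest center.

    In one dimension the squared Mahalanobis distance is (t - mu)^2 / var.
    """
    t = ((X - model["mean"]) @ model["P"]) @ model["w"]
    d = (t[:, None] - model["centers"]) ** 2 / model["variances"]
    return d.argmin(axis=1)
```

In use, each candidate serine or threonine would be represented by a window of the chosen size centered on it, encoded with `sparse_encode`, and stacked into the rows of `X` before calling `fit_pca_lda` and `predict`.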
Keywords
molecular configurations; principal component analysis; proteins; proteomics; HDA; Mahalanobis distance; O-linked glycosylation site prediction; PCA-LDA; hybrid discriminant analysis; mammalian protein glycosylation; protein sequence; serine site; threonine site; Accuracy; Amino acids; Educational institutions; Hybrid intelligent systems; Information science; Mathematics; Principal component analysis; Protein sequence; Support vector machines; Testing; HDA; classification; glycosylation; prediction; protein; sparse coding
fLanguage
English
Publisher
IEEE
Conference_Titel
Hybrid Intelligent Systems, 2009. HIS '09. Ninth International Conference on
Conference_Location
Shenyang
Print_ISBN
978-0-7695-3745-0
Type
conf
DOI
10.1109/HIS.2009.39
Filename
5254306