DocumentCode
2092732
Title
Prediction of Mucin-type O-glycosylation by Support Vector Machines
Author
Nishikawa, Ikuko ; Sakamoto, Hirotaka ; Nouno, Ikue ; Sakakibara, Kazutoshi ; Ito, Masahiro
Author_Institution
Ritsumeikan Univ., Kusatsu
fYear
2007
fDate
23-27 May 2007
Firstpage
1870
Lastpage
1874
Abstract
Mucin-type O-glycosylation is one of the main types of mammalian protein glycosylation. It is specific to serine (Ser) or threonine (Thr) residues, though no consensus sequence is yet known. In this report, support vector machines (SVMs) are used to predict O-glycosylation at each Ser or Thr site in protein sequences. Ninety-nine mammalian protein sequences are selected from UniProt8.0. A protein subsequence of a given length, centered on a Ser or Thr site, is used as input to the SVM after being encoded in one of three ways: sparse encoding, 5-letter encoding, or multiple encoding, which combines the sparse and 5-letter encodings. The prediction experiments show that multiple encoding is the most effective. Effective prediction requires detailed information on the amino acid residues nearest the target site, and coarser information on the biochemical characteristics of the residues within approximately the 15th nearest neighbors of the target site. In addition, the ratio of positive to negative data used for learning is observed to affect performance.
Keywords
association; biology computing; molecular biophysics; proteins; sugar; support vector machines; 5 letter encoding; SVM learning; UniProt8.0; amino acid residue; mammalian protein glycosylation; mammalian protein sequence; mucin type O glycosylation prediction; multiple encoding; protein subsequence length; serine; sparse encoding; support vector machine; threonine; Alzheimer's disease; Amino acids; Biological information theory; Databases; Encoding; Indium tin oxide; Lipidomics; Nearest neighbor searches; Protein sequence; Support vector machines
fLanguage
English
Publisher
IEEE
Conference_Titel
Complex Medical Engineering, 2007. CME 2007. IEEE/ICME International Conference on
Conference_Location
Beijing
Print_ISBN
978-1-4244-1077-4
Electronic_ISBN
978-1-4244-1078-1
Type
conf
DOI
10.1109/ICCME.2007.4382072
Filename
4382072