Title :
Peptide Sequence Tag-Based Blind Identification-based SVM Model
Author :
Li, Hui ; Liu, Chunmei ; Liu, Xumin ; Diakite, Macire ; Burge, Legand ; Yakubu, Abdul-Aziz ; Southerland, William
Author_Institution :
Dept. of Syst. & Comput. Sci., Howard Univ., Washington, DC, USA
Abstract :
Identifying the ion types in a mass spectrum is essential for interpreting the spectrum and deriving its peptide sequence. In this paper, we propose a novel method for identifying ion types and deriving matched peptide sequences from tandem mass spectra. We first divide the dataset into a training set and a testing set, then preprocess the data using a Support Vector Machine and a dual denoting model based on 5-fold cross validation. Next, we construct a syntax tree and generate a rule set to match the mass values in the experimental mass spectra against those in the corresponding theoretical mass spectra. Finally, we apply the proposed algorithm to a tandem mass spectral dataset of 2656 yeast spectra. The experimental results show that, compared with other methods, the proposed method can effectively filter noise and successfully derive peptide sequences.
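The matching of experimental mass values against a theoretical spectrum, mentioned in the abstract, can be illustrated with a minimal sketch. This is not the authors' algorithm: the SVM preprocessing, the syntax tree, and the rule set are not reproduced here. It only assumes singly charged b and y ions, standard monoisotopic residue masses, and a fixed tolerance. The names `theoretical_ions`, `label_peaks`, and `tol` are hypothetical and chosen for illustration.

```python
# Illustrative sketch only; not the method from the paper.
# Standard monoisotopic residue masses in daltons.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
PROTON = 1.007276   # mass of a proton (charge carrier)
WATER = 18.010565   # mass of H2O carried by y ions


def theoretical_ions(peptide):
    """Return {label: m/z} for singly charged b and y fragment ions."""
    ions = {}
    n = len(peptide)
    prefix = 0.0
    for i in range(1, n):
        # b_i covers the first i residues (N-terminal fragment).
        prefix += RESIDUE_MASS[peptide[i - 1]]
        ions[f"b{i}"] = prefix + PROTON
    suffix = 0.0
    for i in range(1, n):
        # y_i covers the last i residues (C-terminal fragment).
        suffix += RESIDUE_MASS[peptide[n - i]]
        ions[f"y{i}"] = suffix + WATER + PROTON
    return ions


def label_peaks(peaks, peptide, tol=0.5):
    """Label each experimental peak with the closest ion within tol, else 'noise'.

    tol is an assumed absolute m/z tolerance, not a value from the paper.
    """
    ions = theoretical_ions(peptide)
    labels = []
    for mz in peaks:
        label, best = "noise", tol
        for name, theo in ions.items():
            diff = abs(mz - theo)
            if diff <= best:
                label, best = name, diff
        labels.append(label)
    return labels


example_labels = label_peaks([58.03, 90.05, 300.0], "GA")
```

Peaks that fall outside the tolerance of every theoretical ion are labeled as noise in this sketch; a learned classifier such as the SVM described in the abstract would replace this fixed-threshold rule.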
Keywords :
biology computing; computational linguistics; identification; mass spectra; mass spectroscopy; proteins; proteomics; support vector machines; 5-fold cross validation based dual denoting model; SVM model; ion identification; peptide sequence tag-based blind identification; rule set generation; support vector machine; syntax tree; tandem mass spectra; Accuracy; Amino acids; Noise; Peptides; Proteins; Support vector machines; Training; Context free grammar; Support Vector Machine; Tandem mass spectrum;
Conference_Title :
Machine Learning and Applications (ICMLA), 2010 Ninth International Conference on
Conference_Location :
Washington, DC
Print_ISBN :
978-1-4244-9211-4
DOI :
10.1109/ICMLA.2010.156