• DocumentCode
    2456530
  • Title
    Peptide Sequence Tag-Based Blind Identification-based SVM Model
  • Author
    Li, Hui ; Liu, Chunmei ; Liu, Xumin ; Diakite, Macire ; Burge, Legand ; Yakubu, Abdul-Aziz ; Southerland, William
  • Author_Institution
    Dept. of Syst. & Comput. Sci., Howard Univ., Washington, DC, USA
  • fYear
    2010
  • fDate
    12-14 Dec. 2010
  • Firstpage
    979
  • Lastpage
    984
  • Abstract
    Identifying the ion types in a mass spectrum is essential for interpreting the spectrum and deriving its peptide sequence. In this paper, we propose a novel method for identifying ion types and deriving matched peptide sequences for tandem mass spectra. We first divide the dataset into a training set and a testing set, then preprocess the data using a Support Vector Machine and a 5-fold cross validation based dual denoting model. Next, we construct a syntax tree and generate a rule set to match the mass values from experimental mass spectra with those from the corresponding theoretical mass spectra. Finally, we apply the proposed algorithm to a tandem mass spectral dataset consisting of 2656 spectra from yeast. The experimental results show that, compared with other methods, the proposed method can effectively filter noise and successfully derive peptide sequences.
  • Keywords
    biology computing; computational linguistics; identification; mass spectra; mass spectroscopy; proteins; proteomics; support vector machines; 5-fold cross validation based dual denoting model; SVM model; ion identification; peptide sequence tag-based blind identification; rule set generation; support vector machine; syntax tree; tandem mass spectra; Accuracy; Amino acids; Noise; Peptides; Proteins; Support vector machines; Training; Context free grammar; Support Vector Machine; Tandem mass spectrum;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Machine Learning and Applications (ICMLA), 2010 Ninth International Conference on
  • Conference_Location
    Washington, DC
  • Print_ISBN
    978-1-4244-9211-4
  • Type
    conf
  • DOI
    10.1109/ICMLA.2010.156
  • Filename
    5708980