Title :
Separation of ion types in tandem mass spectrometry data interpretation - a graph-theoretic approach
Author :
Yan, Bo ; Pan, Chongle ; Olman, Victor N. ; Hettich, Robert L. ; Xu, Ying
Author_Institution :
Georgia Univ., Athens, GA, USA
Abstract :
Mass spectrometry is one of the most popular analytical techniques for identifying individual proteins in a protein mixture, one of the basic problems in proteomics. It identifies a protein by its unique mass spectral pattern. While the problem is theoretically solvable, it remains computationally challenging. One of the key challenges is the difficulty of distinguishing N-terminus from C-terminus ions, mostly b- and y-ions respectively. In this paper, we present a graph algorithm for separating b-ions from y-ions in a set of mass spectra. We represent each spectral peak as a node and consider two types of edges, both predicted from local information: a type-1 edge connects two peaks that are possibly of the same ion type, and a type-2 edge connects two peaks that are possibly of different ion types. The ion-separation problem is then formulated and solved as a graph partition problem: partition the graph into three subgraphs (b-ions, y-ions, and others) so as to maximize the total weight of type-1 edges while minimizing the total weight of type-2 edges within each subgraph. We have developed a dynamic programming algorithm that rigorously solves this graph partition problem and implemented it as a computer program, PRIME. We have tested PRIME on 18 data sets of highly accurate FT-ICR tandem mass spectra and found that it achieved ∼90% accuracy in separating b- and y-ions.
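The partition objective described in the abstract can be illustrated with a minimal sketch. This is not PRIME and not the authors' dynamic programming algorithm: it is an exhaustive search over a toy graph, and the edge weights, the function names (`partition_score`, `best_partition`), and the choice to combine the two goals as "type-1 weight minus type-2 weight inside each subgraph" are all assumptions made for illustration.

```python
from itertools import product

# The three subgraphs named in the abstract.
LABELS = ("b", "y", "other")


def partition_score(labels, type1, type2):
    """Score a labeling of peaks.

    Assumed objective (not taken from the paper): within each subgraph,
    add the weight of every type-1 edge and subtract the weight of every
    type-2 edge. Edges whose endpoints carry different labels are ignored.
    """
    score = 0.0
    for (u, v), w in type1.items():
        if labels[u] == labels[v]:
            score += w
    for (u, v), w in type2.items():
        if labels[u] == labels[v]:
            score -= w
    return score


def best_partition(n_peaks, type1, type2):
    """Try every assignment of the three labels to n_peaks peaks.

    Exponential in n_peaks, so only usable on toy inputs; the paper
    reports a dynamic programming algorithm instead.
    """
    best_labels, best_score = None, float("-inf")
    for labels in product(LABELS, repeat=n_peaks):
        s = partition_score(labels, type1, type2)
        if s > best_score:
            best_labels, best_score = labels, s
    return best_labels, best_score


# Hypothetical toy graph: peaks 0,1 look alike, peaks 2,3 look alike,
# and the pairs (0,2) and (1,3) look like opposite ion types.
toy_type1 = {(0, 1): 1.0, (2, 3): 1.0}
toy_type2 = {(0, 2): 1.0, (1, 3): 1.0}

labels, score = best_partition(4, toy_type1, toy_type2)
print(labels, score)
```

Note that this objective alone cannot tell which group is the b-ions and which is the y-ions, since swapping the two labels leaves the score unchanged; the sketch only shows how the two edge types pull peaks together or apart.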
Keywords :
Fourier transform spectra; biology computing; genetics; graph theory; mass spectra; molecular biophysics; proteins; C-terminus ions; FT-ICR tandem mass spectra; N-terminus ions; b-ions; computer program PRIME; data interpretation; dynamic programming algorithm; graph partition problem; ion separation; protein identification; proteomics; tandem mass spectrometry; y-ions; Bioinformatics; Chemical technology; Computational biology; Databases; Genomics; Mass spectroscopy; Peptides; Protein sequence; Proteomics; Sequences
Conference_Title :
Computational Systems Bioinformatics Conference, 2004. CSB 2004. Proceedings. 2004 IEEE
Print_ISBN :
0-7695-2194-0
DOI :
10.1109/CSB.2004.1332437