Title :
A computational method for assessing peptide-identification reliability in tandem mass spectrometry analysis with SEQUEST
Author :
Razumovskaya, Jane ; Olman, Victor ; Xu, Dong ; Uberbacher, Ed ; VerBerkmoes, Nathan ; Xu, Ying
Author_Institution :
Life Sci. Div., Oak Ridge Nat. Lab., TN, USA
Abstract :
High-throughput protein identification in mass spectrometry is predominantly achieved by first identifying tryptic peptides using SEQUEST and then combining the peptide hits for protein identification. Peptide identification is typically carried out by selecting SEQUEST hits above a specified threshold, the value of which is usually chosen empirically in an attempt to separate true identifications from false ones. These SEQUEST scores are not normalized with respect to the composition, length, and other parameters of the peptides. Furthermore, no rigorous reliability estimate is assigned to the protein identifications derived from these scores. Hence, the interpretation of SEQUEST hits generally requires human involvement, making it difficult to scale up the identification process for genome-scale applications. To overcome these limitations, we have developed a method that combines a neural network and a statistical model to "normalize" SEQUEST scores and to provide a reliability estimate for each SEQUEST hit. This method improves the sensitivity and specificity of peptide identification compared to the standard filtering procedure used in the SEQUEST package, and provides a basis for estimating the reliability of protein identifications.
Keywords :
biology computing; genetics; mass spectroscopy; neural nets; physiological models; proteins; statistical analysis; SEQUEST score normalization; computational method; genome-scale applications; neural network; peptide-identification reliability; protein identification; statistical model; tandem mass spectrometry analysis; tryptic peptides; Bioinformatics; Filtering; Genomics; Humans; Mass spectroscopy; Neural networks; Peptides; Proteins; Sensitivity and specificity; Throughput;
Conference_Title :
Bioinformatics Conference, 2003. CSB 2003. Proceedings of the 2003 IEEE
Print_ISBN :
0-7695-2000-6
DOI :
10.1109/CSB.2003.1227353