Title :
Using Gaussian Process with Test Rejection to Detect T-Cell Epitopes in Pathogen Genomes
Author :
You, Liwen ; Brusic, Vladimir ; Gallagher, Marcus ; Bodén, Mikael
Author_Institution :
Dept. of Theor. Phys., Univ. of Lund, Lund, Sweden
Abstract :
A major challenge in the development of peptide-based vaccines is finding the right immunogenic element, with efficient and long-lasting immunization effects, from large potential targets encoded by pathogen genomes. Computer models are convenient tools for scanning pathogen genomes to preselect candidate immunogenic peptides for experimental validation. Current methods predict many false positives resulting from a low prevalence of true positives. We develop a test reject method based on the prediction uncertainty estimates determined by Gaussian process regression. This method filters false positives among predicted epitopes from a pathogen genome. The performance of stand-alone Gaussian process regression is compared to other state-of-the-art methods using cross validation on 11 benchmark data sets. The results show that the Gaussian process method has the same accuracy as the top performing algorithms. The combination of Gaussian process regression with the proposed test reject method is used to detect true epitopes from the Vaccinia virus genome. The test rejection increases the prediction accuracy by reducing the number of false positives without sacrificing the method's sensitivity. We show that the Gaussian process in combination with test rejection is an effective method for prediction of T-cell epitopes in large and diverse pathogen genomes, where false positives are of concern.
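The idea of rejecting predictions by their uncertainty can be illustrated with a minimal sketch. It is not the authors' implementation: the squared-exponential kernel, the toy one-dimensional features, and the names `gp_predict`, `keep_confident_positives`, `score_cutoff`, and `max_std` are all assumptions chosen for illustration. A real application would encode peptides as feature vectors and tune the kernel and thresholds on data.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=1.0):
    # Squared-exponential kernel between the rows of a and b (assumed kernel choice).
    sq = np.sum(a**2, axis=1)[:, None] + np.sum(b**2, axis=1)[None, :] - 2.0 * a @ b.T
    return np.exp(-0.5 * np.maximum(sq, 0.0) / length_scale**2)

def gp_predict(x_train, y_train, x_test, length_scale=1.0, noise=1e-2):
    # Standard zero-mean GP regression: predictive mean and standard deviation.
    k = rbf_kernel(x_train, x_train, length_scale) + noise * np.eye(len(x_train))
    k_s = rbf_kernel(x_train, x_test, length_scale)
    chol = np.linalg.cholesky(k)
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_train))
    mean = k_s.T @ alpha
    v = np.linalg.solve(chol, k_s)
    var = 1.0 - np.sum(v**2, axis=0)  # prior variance of this kernel is 1
    return mean, np.sqrt(np.maximum(var, 0.0))

def keep_confident_positives(mean, std, score_cutoff, max_std):
    # A prediction is kept only if it scores as positive AND its uncertainty is low.
    # Positives with high predictive uncertainty are rejected (hypothetical rule).
    return (mean >= score_cutoff) & (std <= max_std)

# Toy data: three training points with high scores clustered near x = 0.
x_train = np.array([[-0.1], [0.0], [0.1]])
y_train = np.array([1.0, 1.0, 1.0])
# One test point inside the training region, one well outside it.
x_test = np.array([[0.0], [1.0]])

mean, std = gp_predict(x_train, y_train, x_test)
keep = keep_confident_positives(mean, std, score_cutoff=0.3, max_std=0.5)
print(keep)
```

In this toy setup both test points score above the cutoff, but the point far from the training data carries a much larger predictive standard deviation, so only the nearby point survives the rejection step.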
Keywords :
Gaussian processes; drugs; genomics; microorganisms; regression analysis; Gaussian process regression; T-cell epitope detection; Vaccinia virus genome; computer models; false positives; immunogenic element; pathogen genomes; peptide-based vaccines; prediction uncertainty; test reject method; test rejection; true epitopes; Bioinformatics; Filters; Immune system; Pathogens; Peptides; Testing; Uncertainty; Vaccines; False positive; Immunology; Machine learning; Regression; amino acid sequence; epitope; Algorithms; Epitope Mapping; Epitopes, T-Lymphocyte; Genome, Viral; Normal Distribution; Vaccinia virus
Journal_Title :
Computational Biology and Bioinformatics, IEEE/ACM Transactions on
DOI :
10.1109/TCBB.2008.131