• DocumentCode
    3041973
  • Title

    Predicting Class-II MHC Binding Peptide Using Global Representation of Peptides

  • Author

    Niu, Yanqing

  • Author_Institution
    Sch. of Math. & Stat., South-Central Univ. for Nat., Wuhan, China
  • fYear
    2011
  • fDate
    14-17 Dec. 2011
  • Firstpage
    308
  • Lastpage
    312
  • Abstract
    Binding between peptides and major histocompatibility complex class II (MHC-II) molecules is key to activating the T-cell immune response. Peptides that bind to MHC molecules are known as T-cell epitopes, and identifying epitopes is critical for computer-aided drug design. However, the variable lengths of binding peptides undermine the use of traditional machine learning methods. In this paper, we propose a method that uses whole peptides to predict MHC-II binding affinity from sequence-derived structural and physicochemical properties. First, several groups of structural and physicochemical features derived from protein sequences are adopted, which transform variable-length peptides into fixed-length feature vectors. The sequence-derived features are then combined, and the optimal feature subset is selected by mRMR (minimum Redundancy Maximum Relevance) feature selection. Subsequently, support vector machines (SVMs) are used as the classification engine to construct the prediction models. The performance of our models is evaluated on benchmark datasets. Compared with existing popular quantitative methods, the proposed method gives better or comparable performance, yielding an average AUC of 0.82 on the IEDB datasets and an average AUC of 0.82 on Wang's dataset. The proposed method achieves satisfactory performance relative to existing methods by using a full-length representation of the peptides.
  • Keywords
    CAD; benchmark testing; biochemistry; biology computing; cellular biophysics; drugs; learning (artificial intelligence); molecular biophysics; proteins; support vector machines; IEDB datasets; Wang dataset; activating T-cell immune response; as T-cell epitopes; benchmark datasets; class-II MHC binding peptide; computer-aided drug design; fixed-length feature vectors; global representation; histocompatibility complex class II molecule binding; machine learning methods; minimum redundancy maximum relevance feature selection; peptides binding; physicochemical properties; popular quantitative methods; protein; sequence-derived structure; support vector machines; Amino acids; Bioinformatics; Correlation; Encoding; Immune system; Peptides; Proteins; MHC-II quantitative prediction; T-cell immunity; feature selection; sequence-derived structure and physicochemical features;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Intelligent Computation and Bio-Medical Instrumentation (ICBMI), 2011 International Conference on
  • Conference_Location
    Wuhan, Hubei
  • Print_ISBN
    978-1-4577-1152-7
  • Type

    conf

  • DOI
    10.1109/ICBMI.2011.74
  • Filename
    6131770
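
The abstract's central idea, mapping variable-length peptides to fixed-length feature vectors, can be illustrated with a minimal sketch. The abstract does not say which descriptors the paper uses, so this example substitutes plain amino acid composition (one frequency per standard residue), which is only one simple sequence-derived descriptor and is an assumption here. The function name `composition_vector` and the example peptides are hypothetical, and the feature selection and SVM steps are not shown.

```python
from collections import Counter

# The 20 standard amino acids, in a fixed order that defines the vector layout.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition_vector(peptide: str) -> list[float]:
    """Return the 20-dimensional amino acid composition of a peptide.

    Assumption: composition stands in for the paper's (unspecified)
    structural and physicochemical descriptors.
    """
    peptide = peptide.upper()
    if not peptide:
        raise ValueError("peptide must be non-empty")
    unknown = set(peptide) - set(AMINO_ACIDS)
    if unknown:
        raise ValueError(f"non-standard residues: {sorted(unknown)}")
    counts = Counter(peptide)
    # Each entry is the fraction of the peptide made up of that residue,
    # so the vector length is 20 regardless of the peptide length.
    return [counts[aa] / len(peptide) for aa in AMINO_ACIDS]


# Hypothetical peptides of different lengths (not taken from the paper's datasets).
peptides = ["AAK", "ACDEFGHIKL", "PKYVKQNTLKLAT"]
vectors = [composition_vector(p) for p in peptides]
```

Because every peptide maps to a vector of the same length, the resulting rows could be passed to a feature selector and a standard classifier without padding or truncating the sequences.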