DocumentCode
3041973
Title
Predicting Class-II MHC Binding Peptide Using Global Representation of Peptides
Author
Niu, Yanqing
Author_Institution
Sch. of Math. & Stat., South-Central Univ. for Nat., Wuhan, China
fYear
2011
fDate
14-17 Dec. 2011
Firstpage
308
Lastpage
312
Abstract
Binding between peptides and major histocompatibility complex class II (MHC-II) molecules is key to activating the T-cell immune response. Peptides that bind MHC molecules are known as T-cell epitopes, and identifying epitopes is critical for computer-aided drug design. However, the variable lengths of binding peptides undermine the use of traditional machine learning methods. In this paper, we propose a method that uses whole peptides to predict MHC-II binding affinity from sequence-derived structural and physicochemical properties. First, several groups of structural and physicochemical features derived from protein sequences are adopted, which transform variable-length peptides into fixed-length feature vectors. The sequence-derived features are then combined, and the optimal feature subset is selected by MRMR (minimum Redundancy Maximum Relevance) feature selection. Subsequently, support vector machines (SVMs) are used as the classification engine to construct the prediction models. The performance of our models is evaluated on benchmark datasets. Compared with existing popular quantitative methods, the proposed method gives better or comparable performance, yielding an average AUC of 0.82 on the IEDB datasets and an average AUC of 0.82 on Wang's dataset. By using a full-length representation of the peptides, the proposed method performs satisfactorily relative to existing methods.
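The step of turning variable-length peptides into fixed-length feature vectors can be illustrated with a minimal sketch. The abstract does not list the specific structural and physicochemical descriptors used, so the example below assumes plain amino acid composition (the fraction of each of the 20 standard residues) as a stand-in; the function name `composition_vector` and the example peptides are hypothetical and not taken from the paper.

```python
from collections import Counter

# The 20 standard amino acids in one-letter code; this ordering is an
# arbitrary choice made for the sketch.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition_vector(peptide: str) -> list[float]:
    """Map a peptide of any length to a 20-dimensional composition vector.

    Assumption: amino acid composition stands in for the sequence-derived
    features mentioned in the abstract, which are not specified there.
    Non-standard letters still count toward the peptide length but get no
    entry of their own in the vector.
    """
    peptide = peptide.upper()
    if not peptide:
        raise ValueError("peptide must be non-empty")
    counts = Counter(peptide)
    return [counts[aa] / len(peptide) for aa in AMINO_ACIDS]


# Two hypothetical peptides of different lengths yield vectors of equal size.
short_vec = composition_vector("AAKLV")
long_vec = composition_vector("GELIGILNAAKVPAD")
```

Because every peptide maps to a vector of the same length regardless of how many residues it has, such vectors could in principle be passed to a feature selector and an SVM classifier, as the abstract describes.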
Keywords
CAD; benchmark testing; biochemistry; biology computing; cellular biophysics; drugs; learning (artificial intelligence); molecular biophysics; proteins; support vector machines; IEDB datasets; Wang dataset; activating T-cell immune response; as T-cell epitopes; benchmark datasets; class-II MHC binding peptide; computer-aided drug design; fixed-length feature vectors; global representation; histocompatibility complex class II molecule binding; machine learning methods; minimum redundancy maximum relevance feature selection; peptides binding; physicochemical properties; popular quantitative methods; protein; sequence-derived structure; support vector machines; Amino acids; Bioinformatics; Correlation; Encoding; Immune system; Peptides; Proteins; MHC-II quantitative prediction; T-cell immunity; feature selection; sequence-derived structure and physicochemical features;
fLanguage
English
Publisher
IEEE
Conference_Titel
Intelligent Computation and Bio-Medical Instrumentation (ICBMI), 2011 International Conference on
Conference_Location
Wuhan, Hubei
Print_ISBN
978-1-4577-1152-7
Type
conf
DOI
10.1109/ICBMI.2011.74
Filename
6131770