Title of article :
Preprocessing peptide sequences for multivariate sequence-property analysis
Author/Authors :
Andersson, Per M. and Sjöström, Michael and Lundstedt, Torbjörn
Issue Information :
Biannual, serial issue number, year 1998
Abstract :
The growing number of peptide sequences of different lengths, available from synthesised peptide libraries and sequenced proteins, is potentially valuable for evaluating structure–activity relationships. However, to apply multivariate classification or Quantitative Structure–Activity Relationship (QSAR) analysis to such sequences, a preprocessing method is needed that translates them into a uniform set of variables. Describing each amino acid by its principal properties (z-scales) and then calculating auto cross covariances (ACCs) for each sequence generates a new uniform matrix in which every sequence is described by a vector of equal length. The ACC approach has been used before for the classification of peptides; here, a QSAR analysis based on 20 peptide sequences of different lengths is presented. The results show that a predictive multivariate QSAR model (R²Y(cum) = 86.2%, Q²(cum) = 60.3%) can be obtained with the ACC preprocessing method together with Orthogonal Signal Correction (OSC) and Partial Least Squares (PLS). Permutation tests further supported the validity of the model. The new variables generated by ACCs can also be interpreted, i.e., used to identify important features in the original sequences.
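The ACC step described in the abstract can be illustrated with a minimal sketch. It assumes one common formulation, in which the term for descriptors j and k at a given lag is the average of z_j(i) * z_k(i + lag) over all positions i. The abstract does not state the exact normalisation or the maximum lag the authors used, so both are assumptions here. The descriptor table TOY_SCALES and the function names encode and acc_vector are hypothetical; the numbers are made up for illustration and are not the published z-scale values.

```python
import numpy as np

# Hypothetical 3-component descriptors per residue.
# These are made-up numbers, NOT the published z-scale values.
TOY_SCALES = {
    "A": (0.1, -1.0, 0.5),
    "G": (2.0, -0.5, -1.5),
    "K": (-1.0, 1.5, 0.3),
    "F": (0.7, 0.2, -2.0),
    "D": (-0.4, 0.9, 1.1),
}


def encode(sequence):
    """Turn a peptide string into an (n_residues, n_descriptors) array."""
    return np.array([TOY_SCALES[aa] for aa in sequence], dtype=float)


def acc_vector(sequence, max_lag):
    """Return auto and cross covariance terms for lags 1..max_lag.

    The result has n_descriptors**2 * max_lag entries regardless of the
    sequence length. No per-sequence mean-centring is applied (assumption).
    """
    z = encode(sequence)
    n, _ = z.shape
    if n <= max_lag:
        raise ValueError("sequence must be longer than max_lag")
    terms = []
    for lag in range(1, max_lag + 1):
        # Entry [j, k] averages z_j(i) * z_k(i + lag) over positions i.
        # Diagonal entries are auto covariances, off-diagonal are cross covariances.
        cov = z[:-lag].T @ z[lag:] / (n - lag)
        terms.append(cov.ravel())
    return np.concatenate(terms)


# Three peptides of different lengths map to rows of equal length.
peptides = ["AGKFD", "KFDAGKA", "GGAKFDDKA"]
X = np.vstack([acc_vector(p, max_lag=2) for p in peptides])
print(X.shape)  # (3, 18): 3 descriptors * 3 descriptors * 2 lags
```

The resulting matrix X has one row per peptide and a fixed number of columns, which is the property that lets sequences of unequal length enter a PLS model. The maximum lag must be shorter than the shortest sequence in the data set.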
Keywords :
Peptide sequences , peptide libraries , z-Scales , Auto cross covariances , QSAR , OSC , PLS
Journal title :
Chemometrics and Intelligent Laboratory Systems