Abstract :
Proteolytic processing occurs predominantly at basic amino acid residues. The existence of cleavage sites not recognized by the rules proposed in previous studies prompts us to test whether, and to what extent, such sites are cleaved. Because the cleavage site database from SWISS is imbalanced, SMOTE combined with Tomek links is applied to over-sample the data. A neural network method is then developed to predict the probability of proteolytic processing at these sites in neuropeptide precursors. The sensitivities are 91%, 93%, 91%, 90%, 79% and 83% for KR, RR, RK, KK, R and K, respectively, significantly better than those of previous prediction schemes.
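The resampling and evaluation steps named above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`smote`, `remove_tomek_links`, `sensitivity`), the Euclidean distance, the neighbour count `k`, and the choice to drop both members of each Tomek link are all assumptions made for illustration. How cleavage sites are encoded as numeric feature vectors is not stated in the abstract, so the sketch works on generic tuples of floats.

```python
import math
import random


def nearest(idx, points, candidates):
    """Indices in `candidates` (excluding idx), sorted by distance to points[idx]."""
    return sorted((j for j in candidates if j != idx),
                  key=lambda j: math.dist(points[idx], points[j]))


def smote(minority, n_new, k=3, seed=0):
    """Create n_new synthetic minority points by interpolating between a
    randomly chosen minority sample and one of its k nearest minority
    neighbours (assumed Euclidean distance)."""
    rng = random.Random(seed)
    all_idx = range(len(minority))
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        j = rng.choice(nearest(i, minority, all_idx)[:k])
        t = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(tuple(a + t * (b - a)
                               for a, b in zip(minority[i], minority[j])))
    return synthetic


def remove_tomek_links(points, labels):
    """Drop both members of every Tomek link, i.e. every pair of
    opposite-class points that are each other's nearest neighbour."""
    all_idx = range(len(points))
    nn = [nearest(i, points, all_idx)[0] for i in all_idx]
    drop = {i for i in all_idx
            if nn[nn[i]] == i and labels[i] != labels[nn[i]]}
    keep = [i for i in all_idx if i not in drop]
    return [points[i] for i in keep], [labels[i] for i in keep]


def sensitivity(y_true, y_pred, positive=1):
    """Sensitivity (recall) = TP / (TP + FN) for the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    if tp + fn == 0:
        raise ValueError("no positive examples in y_true")
    return tp / (tp + fn)
```

In this sketch the synthetic points from `smote` would be appended to the minority class before `remove_tomek_links` is run on the combined set, and `sensitivity` would then be computed separately for each site type (KR, RR, RK, KK, R, K) from the classifier's thresholded predictions.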
Keywords :
biology computing; molecular biophysics; neural nets; proteins; SWISS; amino acid residues; neural network method; neuropeptide precursors; probability; proteolytic cleavage sites; proteolytic processing; Amino acids; Biochemistry; Data analysis; Databases; Economic forecasting; Frequency; Intelligent networks; Neural networks; Peptides; Personal communication networks