DocumentCode
464311
Title
Real Value Solvent Accessibility Prediction using Adaptive Support Vector Regression
Author
Gubbi, Jayavardhana ; Shilton, Alistair ; Palaniswami, M. ; Parker, Michael
Author_Institution
Dept. of Electr. & Electron. Eng., Melbourne Univ., Parkville, Vic.
fYear
2007
fDate
1-5 April 2007
Firstpage
395
Lastpage
401
Abstract
Knowledge of the secondary structure and solvent accessibility of a protein plays a vital role in the prediction of its fold and, eventually, its tertiary structure. This paper deals with the prediction of relative solvent accessibility given only the amino-acid sequence. We use an improved support vector regression (SVR) and new kernels for real-valued prediction of solvent accessibility. Two main issues are addressed. First, we address the problem of ε selection, which we found to be somewhat problematic in our earlier work (ε is a parameter with significant influence on the noise insensitivity and generalization of SVRs). In particular, rather than employ the standard trial-and-error approach, we use an improved tube shrinking method to find ε. Second, we give a novel kernel that combines a solvation model, an electrostatic charge model, and evolutionary information in the form of a position specific scoring matrix (PSSM). A new dataset of 472 proteins with less than 20% sequence identity is curated and used to evaluate the results. To make a more objective comparison with earlier methods, we also use a standard dataset and show that the proposed scheme is better than the ones normally used in the literature. We report the lowest mean absolute error (MAE) so far, 0.12, on the standard dataset.
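As a rough illustration of the two quantities the abstract refers to, the sketch below computes the standard ε-insensitive loss used in SVR and the mean absolute error (MAE). It is not the authors' implementation: the function names and the sample accessibility values are hypothetical, and the tube shrinking procedure for choosing ε is not reproduced here.

```python
def epsilon_insensitive_loss(y_true, y_pred, epsilon):
    """Mean epsilon-insensitive loss.

    Errors that fall inside the tube of half-width epsilon cost nothing;
    larger errors are penalized linearly by the amount they exceed it.
    """
    total = sum(max(0.0, abs(t - p) - epsilon) for t, p in zip(y_true, y_pred))
    return total / len(y_true)


def mean_absolute_error(y_true, y_pred):
    """Mean absolute error between observed and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical relative solvent accessibility values in [0, 1],
# made up for illustration only (not taken from the paper's datasets).
observed = [0.10, 0.45, 0.80, 0.30]
predicted = [0.15, 0.40, 0.60, 0.30]

mae = mean_absolute_error(observed, predicted)
loss = epsilon_insensitive_loss(observed, predicted, epsilon=0.1)
```

With ε set to 0 the ε-insensitive loss reduces to the MAE, which is one way to see why the choice of ε controls how much small-scale noise the regressor ignores.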
Keywords
biology computing; proteins; regression analysis; support vector machines; adaptive support vector regression; electrostatic charge model; evolutionary information; position specific scoring matrix; protein secondary structure; protein tertiary structure; real value solvent accessibility prediction; solvation model; Feedforward neural networks; Feedforward systems; Kernel; Multi-layer neural network; Neural networks; Proteins; Sequences; Solvents; Support vector machine classification; Support vector machines
fLanguage
English
Publisher
IEEE
Conference_Titel
Computational Intelligence and Bioinformatics and Computational Biology, 2007. CIBCB '07. IEEE Symposium on
Conference_Location
Honolulu, HI
Print_ISBN
1-4244-0710-9
Type
conf
DOI
10.1109/CIBCB.2007.4221249
Filename
4221249