Title :
Protein secondary structure prediction using periodic-quadratic-logistic models: statistical and theoretical issues
Author :
Munson, Peter J. ; Di Francesco, Valentina ; Porrelli, Raul
Author_Institution :
Div. of Comput. Res. & Technol., Nat. Inst. of Health, Bethesda, MD, USA
Abstract :
We extend logistic discriminant function methodology to compete effectively with neural networks and "information theory" methods in prediction of protein secondary structure. Unlike "black-box" methods, our model produces 400 pairwise interaction parameters which are interpretable from a molecular standpoint. Under optimal conditions, our model can produce up to 65.9% crossvalidated prediction accuracy on three states. A broad family of models is searched using a semi-parametric (penalized) approach combined with stepwise parameter selection. We show that optimal models have about 800 effective parameters for this data set. The highest prediction accuracy is concentrated in a fraction of the total residues, and the confidence of a prediction can be easily calculated. Such high-confidence predictions may be useful as the basis for prediction of the complete structure of the protein.
Keywords :
biology computing; information theory; maximum likelihood estimation; neural nets; proteins; black-box methods; crossvalidated prediction accuracy; high-confidence predictions; logistic discriminant function methodology; maximum likelihood logistic models; neural networks; optimal conditions; pairwise interaction parameters; periodic-quadratic-logistic models; prediction accuracy; protein secondary structure prediction; semi-parametric approach; stepwise parameter selection
Conference_Title :
System Sciences, 1994. Proceedings of the Twenty-Seventh Hawaii International Conference on
Conference_Location :
Wailea, HI, USA
Print_ISBN :
0-8186-5090-7
DOI :
10.1109/HICSS.1994.323556