DocumentCode :
3714564
Title :
A multi-stage protein secondary structure prediction system using machine learning and information theory
Author :
Masood Zamani;Stefan C. Kremer
Author_Institution :
School of Computer Science, University of Guelph, Canada
fYear :
2015
Firstpage :
1304
Lastpage :
1309
Abstract :
In this paper, we evaluated the performance of a multi-stage protein secondary structure (PSS) prediction model. The proposed classifier uses statistical information and protein profiles. The statistical information is derived from protein sequences and structures using k-means clustering and information theory. In the first stage, a feed-forward artificial neural network maps a sequence fragment to a region in the Ramachandran plot (2D-plot). A score vector is then constructed from the mapped region using the clustering and statistical information. For a given residue, the score vector represents the tendency of an identified region in the 2D-plot to pair with each secondary structure. The score vectors used in the second stage have fewer dimensions than the input vectors commonly derived from protein sequences or profile information. In the second stage, a two-tier classifier based on an artificial neural network and a genetic programming (GP) method is employed. The GP method uses IF rules for three-state classification. The two-tier classifier's performance is compared with that of two-tier artificial neural networks (ANNs) and support vector machines (SVMs). The prediction method is evaluated on a common protein dataset, RS126. The performance of the proposed classification model is measured by Q3 and segment overlap (SOV) scores. The proposed PSS prediction model improves the Q3 score by over 3% and the SOV score by 2% compared with the two-tier ANN and SVM architectures.
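The Q3 score mentioned in the abstract is the percentage of residues whose three-state label (helix H, strand E, coil C) is predicted correctly. The following is a minimal sketch of that standard definition, not the authors' evaluation code; the function name `q3_score` and the example strings are illustrative assumptions.

```python
def q3_score(predicted: str, observed: str) -> float:
    """Return the Q3 score: percentage of residues with a correct
    three-state label (H = helix, E = strand, C = coil)."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not observed:
        raise ValueError("sequences must not be empty")
    # Count positions where the predicted state matches the observed state.
    correct = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * correct / len(observed)


# Hypothetical 10-residue example: one residue (position 6) is mispredicted.
observed = "CCHHHHEEEC"
predicted = "CCHHHCEEEC"
print(q3_score(predicted, observed))  # 90.0
```

Q3 treats every residue independently, which is why the abstract also reports the segment overlap (SOV) score: SOV rewards predictions that recover whole secondary structure segments rather than isolated residues.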
Keywords :
"Artificial neural networks","Information theory","Proteins","Support vector machines"
Publisher :
IEEE
Conference_Titel :
Bioinformatics and Biomedicine (BIBM), 2015 IEEE International Conference on
Type :
conf
DOI :
10.1109/BIBM.2015.7359867
Filename :
7359867