DocumentCode :
3394022
Title :
Protein secondary structure prediction using rule induction from coverings
Author :
Lee, Leong ; Leopold, Jennifer L. ; Frank, Ronald L. ; Maglia, Anne M.
Author_Institution :
Dept. of Comput. Sci., Missouri Univ. of Sci. & Technol., Rolla, MO
fYear :
2009
fDate :
March 30, 2009 to April 2, 2009
Firstpage :
79
Lastpage :
86
Abstract :
With the increase of data from genome sequencing projects comes the need for reliable and efficient methods for the analysis and classification of protein motifs and domains. Experimental methods currently used to determine protein structure are accurate, yet expensive both in terms of time and equipment. Therefore, various computational approaches to solving the problem have been attempted, although their accuracy has rarely exceeded 75%. In this paper, a rule-based method to predict protein secondary structure is presented. This method uses a newly developed data-mining algorithm called RT-RICO (Relaxed Threshold Rule Induction from Coverings), which identifies dependencies between amino acids in a protein sequence, and generates rules that can be used to predict secondary structures. The average prediction accuracy on sample data sets, or Q3 score, using RT-RICO was 80.3%, an improvement over comparable computational methods.
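Note on the Q3 score: the sketch below illustrates the accuracy measure named in the abstract. It assumes the usual three-state definition, in which each residue is labelled helix (H), strand (E), or coil (C) and Q3 is the percentage of residues whose predicted label matches the observed one. The function name `q3_score` and the example strings are hypothetical and are not taken from the paper. The sketch does not implement RT-RICO itself.

```python
def q3_score(predicted: str, observed: str) -> float:
    """Percentage of residues whose predicted state (H, E, or C)
    matches the observed state. Assumes one label per residue."""
    if len(predicted) != len(observed):
        raise ValueError("sequences must have the same length")
    if not observed:
        raise ValueError("sequences must be non-empty")
    # Count positions where the predicted and observed labels agree.
    matches = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * matches / len(observed)


# Hypothetical 10-residue example: one residue is mispredicted.
observed = "CCHHHHEEEC"
predicted = "CCHHHEEEEC"
print(q3_score(predicted, observed))  # 90.0
```

An average Q3 over several proteins can be taken either per protein or per residue; the abstract does not say which convention the reported 80.3% uses.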
Keywords :
data mining; molecular biophysics; molecular configurations; proteins; amino acids; data-mining algorithm; protein secondary structure prediction; protein sequence; relaxed threshold rule induction; rule induction; Accuracy; Amino acids; Bioinformatics; Genomics; Induction generators; Neural networks; Nuclear magnetic resonance; Probability; Protein engineering; Protein sequence
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Computational Intelligence in Bioinformatics and Computational Biology, 2009. CIBCB '09. IEEE Symposium on
Conference_Location :
Nashville, TN
Print_ISBN :
978-1-4244-2756-7
Type :
conf
DOI :
10.1109/CIBCB.2009.4925711
Filename :
4925711