DocumentCode :
614876
Title :
Data mining techniques to predict protein secondary structures
Author :
Fayech, Sondes ; Essoussi, Nadia ; Limam, Mohamed
Author_Institution :
LARODEC, Univ. of Tunis, Tunis, Tunisia
fYear :
2013
fDate :
28-30 April 2013
Firstpage :
1
Lastpage :
5
Abstract :
Protein secondary structure prediction is a key step in predicting protein tertiary structure. Many methods based on machine learning techniques, such as neural networks (NN) and support vector machines (SVM), have emerged for predicting secondary structures. This paper proposes a new method, DM-pred, which combines a protein clustering method to detect homologous sequences, a sequential pattern mining method to detect frequent patterns, feature extraction and quantification approaches to prepare features, and an SVM to predict structures. When tested on the most popular secondary structure datasets, DM-pred achieved a Q3 accuracy of 78.20% and a SOV of 76.49%, indicating that it is among the top-performing methods for protein secondary structure prediction.
Keywords :
bioinformatics; data mining; feature extraction; learning (artificial intelligence); neural nets; pattern clustering; proteins; support vector machines; DM-pred method; SVM method; data mining technique; feature extraction; frequent pattern detection; homologous sequence detection; machine learning technique; neural network; protein clustering method; protein secondary structure prediction; protein tertiary structure; quantification approach; sequential pattern mining method; support vector machine; Amino acids; Data mining; Databases; Protein sequence; Support vector machines; Training; SVM; clustering; features; protein secondary structure prediction; sequential pattern mining;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Modeling, Simulation and Applied Optimization (ICMSAO), 2013 5th International Conference on
Conference_Location :
Hammamet
Print_ISBN :
978-1-4673-5812-5
Type :
conf
DOI :
10.1109/ICMSAO.2013.6552701
Filename :
6552701