Title :
DM-pred Method: A New Method to Predict Secondary Structures Based on Data Mining Techniques
Author :
Fayech, Sondès ; Essoussi, Nadia ; Limam, Mohamed
Author_Institution :
LARODEC, Univ. of Tunis, Tunis, Tunisia
Date :
Aug. 29 2011-Sept. 2 2011
Abstract :
Protein secondary structure prediction is a key step in predicting protein tertiary structure. Many methods based on machine learning techniques, such as neural networks (NN) and support vector machines (SVM), have been developed for secondary structure prediction. This paper proposes a new method, DM-pred, which combines a protein clustering method to detect homologous sequences, a sequential pattern mining method to detect frequent patterns, feature extraction and quantification approaches to prepare features, and an SVM to predict structures. When tested on the most popular secondary structure datasets, DM-pred achieved a Q3 accuracy of 78.20% and a SOV of 76.49%, indicating that it is among the top-performing methods for protein secondary structure prediction.
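For readers unfamiliar with the Q3 figure quoted above, the following is a minimal sketch of how a Q3 score is commonly computed: the percentage of residues whose predicted state (helix H, strand E, or coil C) matches the observed one. The function name `q3_accuracy` and the example strings are illustrative assumptions, not taken from the paper, and the SOV (segment overlap) measure is not covered here.

```python
def q3_accuracy(observed: str, predicted: str) -> float:
    """Return the Q3 score: percent of residues with a correctly predicted state.

    Both arguments are strings over the three-state alphabet H/E/C,
    one character per residue (an assumed encoding for this sketch).
    """
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    if not observed:
        raise ValueError("sequences must not be empty")
    # Count positions where the predicted state equals the observed state.
    correct = sum(o == p for o, p in zip(observed, predicted))
    return 100.0 * correct / len(observed)


# Hypothetical 10-residue example with one mismatch at position 4.
observed = "HHHHEEEECC"
predicted = "HHHCEEEECC"
score = q3_accuracy(observed, predicted)
print(f"Q3 = {score:.2f}%")
```

In this toy case nine of ten residues agree, so the score is 90%. On a real benchmark the score is typically averaged over all residues (or over all proteins) in the test set.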
Keywords :
bioinformatics; data mining; learning (artificial intelligence); neural nets; pattern clustering; proteins; support vector machines; DM-pred method; SVM method; data mining; homologous sequence; machine learning; neural network; protein clustering; protein tertiary structure; secondary structure prediction; sequential pattern mining; support vector machine; Amino acids; Data mining; Databases; Protein sequence; Support vector machines; Training; SVM; clustering; features; protein secondary structure prediction; sequential pattern mining;
Conference_Title :
Database and Expert Systems Applications (DEXA), 2011 22nd International Workshop on
Conference_Location :
Toulouse
Print_ISBN :
978-1-4577-0982-1
DOI :
10.1109/DEXA.2011.27