DocumentCode :
2845060
Title :
Secondary structure prediction using SVM and clustering
Author :
Doong, Shing H. ; Yeh, Chi Y.
Author_Institution :
Dept. of Inf. Manage., ShuTe Univ., Kaohsiung, Taiwan
fYear :
2004
fDate :
5-8 Dec. 2004
Firstpage :
297
Lastpage :
302
Abstract :
Protein secondary structure can be used to help determine the tertiary structure via the fold recognition method. Predicting the secondary structure from the protein sequence has attracted the attention of many researchers. The support vector machine (SVM) is a new learning algorithm that has been successfully applied to many prediction problems. However, the algorithm takes a long time to train the prediction model when the data set is large. It therefore becomes important to revise the method so that training time is reduced while accuracy is maintained. In this study, we implement a genetic algorithm to cluster the training set before a prediction model is built. Using a position specific scoring matrix (PSSM) as part of the input, the hybrid method achieves good performance on a set of 513 nonredundant protein sequences and a set of 294 partially redundant sequences. The results also show that clustering achieves the goal of data preprocessing differently on redundant and nonredundant sets, and it seems almost preferable to cluster the data before prediction is performed.
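The preprocessing idea in the abstract (cluster the training set first, then build the model on the smaller set) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' method: plain k-means stands in for the genetic algorithm, the 2-D vectors are toy stand-ins for PSSM-derived features, the labels H/E/C are the usual three-state helix/strand/coil convention, and the names `kmeans` and `reduce_training_set` are hypothetical. The SVM training step itself is omitted.

```python
# Illustrative sketch only: the paper clusters with a genetic algorithm;
# plain k-means is swapped in here to keep the example short.
from math import dist


def kmeans(points, k, iterations=20):
    """Cluster points (tuples of floats) into k groups and return the centroids."""
    centroids = [points[i] for i in range(k)]  # deterministic init: first k points
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: dist(p, centroids[j]))
            groups[nearest].append(p)
        # Recompute each centroid; keep the old one if its group is empty.
        centroids = [
            tuple(sum(c) / len(g) for c in zip(*g)) if g else centroids[j]
            for j, g in enumerate(groups)
        ]
    return centroids


def reduce_training_set(samples, k_per_class):
    """samples: list of (feature_vector, label) pairs.

    Clusters each class separately and returns (centroid, label) pairs,
    which would then be passed to the classifier in place of the full set.
    """
    reduced = []
    for label in sorted({lab for _, lab in samples}):
        pts = [vec for vec, lab in samples if lab == label]
        k = min(k_per_class, len(pts))
        reduced.extend((c, label) for c in kmeans(pts, k))
    return reduced


# Toy data (assumed): 2-D stand-ins for PSSM-derived feature vectors.
samples = [
    ((0.0, 0.0), "H"), ((0.2, 0.0), "H"), ((4.0, 4.0), "H"), ((4.2, 4.0), "H"),
    ((9.0, 0.0), "E"), ((9.0, 0.2), "E"),
    ((0.0, 9.0), "C"),
]
reduced = reduce_training_set(samples, k_per_class=2)
print(len(samples), "->", len(reduced))
```

Clustering each class separately keeps every centroid tied to a single label, so the reduced set can be used directly as training data. How many clusters to keep per class is the knob that trades training time against accuracy.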
Keywords :
biology computing; genetic algorithms; learning (artificial intelligence); pattern clustering; proteins; sequences; support vector machines; SVM; genetic algorithm; position specific scoring matrix; protein secondary structure prediction; protein sequences; support vector machine; training set; Accuracy; Artificial neural networks; Clustering algorithms; Encoding; Information management; Machine learning algorithms; Prediction algorithms; Predictive models; Proteins; Support vector machines; Secondary structure prediction; clustering
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Hybrid Intelligent Systems, 2004. HIS '04. Fourth International Conference on
Print_ISBN :
0-7695-2291-2
Type :
conf
DOI :
10.1109/ICHIS.2004.84
Filename :
1410020