  • DocumentCode
    2845060
  • Title
    Secondary structure prediction using SVM and clustering
  • Author
    Doong, Shing H. ; Yeh, Chi Y.
  • Author_Institution
    Dept. of Inf. Manage., ShuTe Univ., Kaohsiung, Taiwan
  • fYear
    2004
  • fDate
    5-8 Dec. 2004
  • Firstpage
    297
  • Lastpage
    302
  • Abstract
    Protein secondary structure can be used to help determine tertiary structure via the fold recognition method, and predicting secondary structure from the protein sequence has attracted the attention of many researchers. The support vector machine (SVM) is a new learning algorithm that has been successfully applied to many prediction problems. However, training an SVM prediction model takes a long time when the data set is large, so it becomes important to revise the method so that training time improves while accuracy is maintained. In this study, we implement a genetic algorithm to cluster the training set before a prediction model is built. Using a position specific scoring matrix (PSSM) as part of the input, the hybrid method achieves good performance on a set of 513 nonredundant protein sequences and a set of 294 partially redundant sequences. The results also show that clustering achieves the goal of data preprocessing differently on redundant and nonredundant sets, and it seems almost always preferable to cluster the data before prediction is performed.
  • Keywords
    biology computing; genetic algorithms; learning (artificial intelligence); pattern clustering; proteins; sequences; support vector machines; SVM; genetic algorithm; pattern clustering; position specific scoring matrix; protein secondary structure prediction; protein sequences; support vector machine; training set; Accuracy; Artificial neural networks; Clustering algorithms; Encoding; Information management; Machine learning algorithms; Prediction algorithms; Predictive models; Proteins; Support vector machines; Secondary structure prediction; clustering; position specific scoring matrix; support vector machine;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Hybrid Intelligent Systems, 2004. HIS '04. Fourth International Conference on
  • Print_ISBN
    0-7695-2291-2
  • Type
    conf
  • DOI
    10.1109/ICHIS.2004.84
  • Filename
    1410020