• DocumentCode
    1640639
  • Title

    Protein similarity networks and Genetic Algorithm driven feature selection for fold recognition

  • Author

    Valavanis, Joannis K. ; Spyrou, George M. ; Nikita, Konstantina S.

  • Author_Institution
    Sch. of Electr. & Comput. Eng., Nat. Tech. Univ. of Athens, Athens
  • fYear
    2008
  • Firstpage
    1
  • Lastpage
    6
  • Abstract
    Fold recognition from sequence-derived features is a complex classification problem, usually tackled by applying machine learning techniques to those features. Here we address the task of fold recognition on a protein similarity network (PSN) basis. We construct a protein sequence similarity network (PSeSN) using a set of 125 sequence-derived features for an available set of 311 proteins. The PSeSN is optimized by using a Genetic Algorithm (GA) to select the features that yield a PSeSN as similar as possible to the corresponding protein structure similarity network (PStSN). A random walk based algorithm is then used to recognize the fold of a query protein sequence by calculating its affinities to the sequence vertices in both the initial and the optimized PSeSN. Total accuracy (TA) measurements obtained using 10-fold cross validation show that the optimized PSeSN, built from 48 of the 125 sequence-derived features, yielded better results (mean TA: 0.35 in testing sets) than the initial PSeSN (mean TA: 0.316 in testing sets).
  • Keywords
    biology computing; expert systems; feature extraction; genetic algorithms; learning (artificial intelligence); proteins; random processes; feature selection; fold recognition; genetic algorithm; machine learning technique; protein sequence similarity network; protein similarity networks; protein structure similarity network; random walk based algorithm; sequence derived features; Biomedical computing; Biomedical engineering; Classification tree analysis; Data mining; Decision trees; Genetic algorithms; Neural networks; Nuclear magnetic resonance; Protein sequence; Testing;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    BioInformatics and BioEngineering, 2008. BIBE 2008. 8th IEEE International Conference on
  • Conference_Location
    Athens
  • Print_ISBN
    978-1-4244-2844-1
  • Electronic_ISBN
    978-1-4244-2845-8
  • Type

    conf

  • DOI
    10.1109/BIBE.2008.4696704
  • Filename
    4696704
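
The abstract describes recognizing a query's fold from its random-walk affinities to the vertices of a sequence similarity network, optionally restricted to a GA-selected feature subset. The record gives no implementation details, so the following is only a minimal sketch under stated assumptions: the edge weight `1 / (1 + distance)`, the restart probability, the affinity-sum voting rule, and all function names are hypothetical illustrations, not the authors' method. The `mask` argument stands in for a feature subset such as the one a GA might select.

```python
import math


def transition_matrix(features, mask=None):
    """Row-normalised similarity network over feature vectors.

    Assumption: edge weight is 1 / (1 + Euclidean distance), no self-loops.
    `mask` is an optional list of booleans choosing which features to use.
    """
    if mask is not None:
        features = [[x for x, keep in zip(row, mask) if keep] for row in features]
    n = len(features)
    matrix = []
    for i in range(n):
        weights = [
            0.0 if i == j else 1.0 / (1.0 + math.dist(features[i], features[j]))
            for j in range(n)
        ]
        total = sum(weights)
        matrix.append([w / total for w in weights])
    return matrix


def random_walk_affinities(matrix, start, restart=0.3, iters=100):
    """Random walk with restart from `start`; returns visit probabilities."""
    n = len(matrix)
    origin = [1.0 if i == start else 0.0 for i in range(n)]
    p = origin[:]
    for _ in range(iters):
        stepped = [sum(p[i] * matrix[i][j] for i in range(n)) for j in range(n)]
        p = [(1 - restart) * stepped[j] + restart * origin[j] for j in range(n)]
    return p


def predict_fold(train_features, train_folds, query, mask=None):
    """Assign the fold whose labelled vertices collect the most affinity."""
    matrix = transition_matrix(train_features + [query], mask)
    query_index = len(train_features)
    affinities = random_walk_affinities(matrix, query_index)
    scores = {}
    for fold, affinity in zip(train_folds, affinities):  # query vertex is skipped
        scores[fold] = scores.get(fold, 0.0) + affinity
    return max(scores, key=scores.get)


# Toy data (invented for illustration): the third feature is uninformative.
train = [[0.0, 0.0, 1.0], [0.1, 0.0, -1.0], [5.0, 5.0, 1.0], [5.1, 5.0, -1.0]]
folds = ["a", "a", "b", "b"]
query = [0.05, 0.1, 0.0]
print(predict_fold(train, folds, query))
print(predict_fold(train, folds, query, mask=[True, True, False]))
```

In this sketch the initial and the optimized network differ only in the `mask` passed to `transition_matrix`, which mirrors the abstract's comparison of the full 125-feature PSeSN against the 48-feature one.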