• DocumentCode
    2989477
  • Title
    Machine learning based search space optimisation for drug discovery
  • Author
    Senanayake, Upul ; Prabuddha, Rahal ; Ragel, Roshan
  • Author_Institution
    Dept. of Comput. Eng., Univ. of Peradeniya, Peradeniya, Sri Lanka
  • fYear
    2013
  • fDate
    16-19 April 2013
  • Firstpage
    68
  • Lastpage
    75
  • Abstract
    Drug discovery research has progressed to the point where its success depends on high-performance computer systems and very large databases. Virtual Screening (VS), a computationally intensive process, therefore plays a major role in systematic drug design for pressing diseases. The VS process must be made as fast as possible in order to efficiently dock ligands from these databases to a selected protein receptor targeted by a drug. Because the number of ligands in the databases is growing extremely quickly, the problem cannot be tackled by improving computing resources alone. Researchers therefore work on an orthogonal technique: using soft computing to reduce the search space by identifying non-dockable ligands, which improves overall throughput. Machine Learning (ML) can be used to train a binary classifier that assigns each ligand to one of two known classes: dockable or non-dockable. In this paper, for the first time, we use three ML techniques (Support Vector Machines, Artificial Neural Networks and Random Forest) on a single problem domain (a protease receptor of HIV) and evaluate the performance of the respective models. We show that such classification improves throughput twofold with around 90% accuracy. In addition, we propose and use a technique for constructing a training set for ML in VS applications when the receptor is non-synthesised.
  • Keywords
    biology computing; drugs; learning (artificial intelligence); neural nets; proteins; support vector machines; HIV protease receptor; ML; artificial neural network; disease; dockable ligand; drug designing process; drug discovery; machine learning; nondockable ligand; orthogonal technique; protein receptor; random forest; search space optimisation; soft computing; support vector machine; virtual screening; Accuracy; Databases; Drugs; Machine learning algorithms; Proteins; Support vector machines; Training; Artificial Neural Networks; Autodock Vina; Machine Learning; Random Forest; Support Vector Machines; Virtual Screening;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computational Intelligence in Bioinformatics and Computational Biology (CIBCB), 2013 IEEE Symposium on
  • Conference_Location
    Singapore
  • Type
    conf
  • DOI
    10.1109/CIBCB.2013.6595390
  • Filename
    6595390