• DocumentCode
    2074920
  • Title
    TVscreen: Trend Vector Virtual SCREENing of Large Commercial Compounds Collections
  • Author
    Plewczynski, Dariusz
  • Author_Institution
    Interdiscipl. Centre for Math. & Comput. Modeling, Univ. of Warsaw, Warsaw
  • fYear
    2008
  • fDate
    June 29 2008-July 5 2008
  • Firstpage
    59
  • Lastpage
    63
  • Abstract
    We present a trend vector based method for identifying inhibitors of a given protein target. When some initial information about actives is already available, the approach reduces the number of compounds that must be tested experimentally in costly validation studies. The machine learning method is trained on compounds from the Elsevier Molecular Design Ltd (MDL) Information Systems' Drug Data Report (MDDR) for five diverse protein targets (cyclooxygenase-2, dihydrofolate reductase, thrombin, HIV reverse transcriptase, and antagonists of the estrogen receptor). Each classified ligand is represented by an optimized set of two-dimensional topological descriptors. The trend vectors are then used to divide the whole set of ligands into two groups: 1) molecules predicted to be active, and 2) those predicted to be inactive. Both training and predicted activities are treated as binary. The accuracy of the method is comparable to that of other existing prediction tools (such as support vector machines or random forests), while it offers significantly higher speed and portability. The precision of prediction reaches 60% on heterogeneous source data. As a consequence, the method can easily be applied to large commercial compound collections.
  • Keywords
    biology computing; enzymes; learning (artificial intelligence); molecular biophysics; HIV-reverse transcriptase; TVscreen; cyclooxygenase-2; dihydrofolatereductase; drug data report; estrogen receptor; large commercial compounds collections; ligand; machine learning; protein; thrombin; trend vector virtual screening; Bioinformatics; Databases; Drugs; Inhibitors; Learning systems; Machine learning algorithms; Proteins; Support vector machine classification; Support vector machines; Testing; Chemical Descriptors; Compound Identification; MDL Drug Data Report; Machine-learning methods; Trend Vectors; Virtual High-Throughput Screening;
  • fLanguage
    English
  • Publisher
    ieee
  • Conference_Titel
    Biocomputation, Bioinformatics, and Biomedical Technologies, 2008. BIOTECHNO '08. International Conference on
  • Conference_Location
    Bucharest
  • Print_ISBN
    978-0-7695-3191-5
  • Electronic_ISBN
    978-0-7695-3191-5
  • Type
    conf
  • DOI
    10.1109/BIOTECHNO.2008.15
  • Filename
    4561135