DocumentCode
2074920
Title
TVscreen: Trend Vector Virtual SCREENing of Large Commercial Compounds Collections
Author
Plewczynski, Dariusz
Author_Institution
Interdiscipl. Centre for Math. & Comput. Modeling, Univ. of Warsaw, Warsaw
fYear
2008
fDate
June 29 2008-July 5 2008
Firstpage
59
Lastpage
63
Abstract
We present a trend vector based method for identifying inhibitors of a given protein target. When some initial information about actives is already available, the approach reduces the number of compounds that must be tested experimentally in costly validation studies. The machine learning method is trained on compounds from the Elsevier Molecular Design Ltd (MDL) Information Systems' Drug Data Report (MDDR) for five diverse protein targets (cyclooxygenase-2, dihydrofolate reductase, thrombin, HIV reverse transcriptase, and antagonists of the estrogen receptor). Each classified ligand is represented by an optimized set of two-dimensional topological descriptors. The trend vectors are then used to divide the whole set of ligands into two groups: 1) molecules predicted to be active, and 2) those predicted to be inactive. Both training and predicted activities are treated as binary. The accuracy of the method is comparable to that of other existing prediction tools (such as support vector machines or random forests), while it offers significantly higher speed and portability. Prediction accuracy (measured as precision) reaches 60% on heterogeneous source data. As a consequence, the method can easily be applied to large commercial compound collections.
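The classification step described in the abstract can be illustrated with a minimal sketch. It assumes one common formulation of a trend vector, namely the sum of descriptor vectors weighted by mean-centered activity, with a compound scored by its dot product with that vector. The abstract does not specify the descriptor set, the exact weighting, or the decision threshold, so the toy descriptors, the function names, and the threshold of 0 below are all hypothetical.

```python
def trend_vector(descriptors, activities):
    """Sum of descriptor vectors weighted by mean-centered activity.

    Assumed formulation; the paper's exact definition is not given
    in the abstract.
    """
    mean_activity = sum(activities) / len(activities)
    dim = len(descriptors[0])
    tv = [0.0] * dim
    for x, y in zip(descriptors, activities):
        weight = y - mean_activity
        for j in range(dim):
            tv[j] += weight * x[j]
    return tv


def score(tv, x):
    """Dot product of a compound's descriptor vector with the trend vector."""
    return sum(t * v for t, v in zip(tv, x))


def classify(tv, x, threshold=0.0):
    """Binary prediction: 1 (active) if the score exceeds the threshold.

    The threshold value is an illustrative assumption.
    """
    return 1 if score(tv, x) > threshold else 0


def precision(predicted, actual):
    """Fraction of predicted actives that are truly active."""
    tp = sum(1 for p, a in zip(predicted, actual) if p == 1 and a == 1)
    fp = sum(1 for p, a in zip(predicted, actual) if p == 1 and a == 0)
    return tp / (tp + fp) if (tp + fp) else 0.0


# Toy training set: three hypothetical binary descriptors per compound,
# with binary activity labels (1 = active, 0 = inactive).
train_x = [[1, 1, 0], [1, 0, 0], [0, 0, 1], [0, 1, 1]]
train_y = [1, 1, 0, 0]

tv = trend_vector(train_x, train_y)
predictions = [classify(tv, x) for x in train_x]
```

Because scoring a compound is a single dot product, screening a large collection under this formulation costs time linear in the number of compounds and descriptors, which is consistent with the speed claim in the abstract.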
Keywords
biology computing; enzymes; learning (artificial intelligence); molecular biophysics; HIV-reverse transcriptase; TVscreen; cyclooxygenase-2; dihydrofolatereductase; drug data report; estrogen receptor; large commercial compounds collections; ligand; machine learning; protein; thrombin; trend vector virtual screening; Bioinformatics; Databases; Drugs; Inhibitors; Learning systems; Machine learning algorithms; Proteins; Support vector machine classification; Support vector machines; Testing; Chemical Descriptors; Compound Identification; MDL Drug Data Report; Machine-learning methods; Trend Vectors; Virtual High-Throughput Screening;
fLanguage
English
Publisher
IEEE
Conference_Titel
Biocomputation, Bioinformatics, and Biomedical Technologies, 2008. BIOTECHNO '08. International Conference on
Conference_Location
Bucharest
Print_ISBN
978-0-7695-3191-5
Electronic_ISBN
978-0-7695-3191-5
Type
conf
DOI
10.1109/BIOTECHNO.2008.15
Filename
4561135