DocumentCode :
2319935
Title :
Bcl::ChemInfo - Qualitative analysis of machine learning models for activation of HSD involved in Alzheimer's Disease
Author :
Butkiewicz, Mariusz ; Lowe, Edward W., Jr. ; Meiler, Jens
Author_Institution :
Chem., Vanderbilt Univ., Nashville, TN, USA
fYear :
2012
fDate :
9-12 May 2012
Firstpage :
329
Lastpage :
334
Abstract :
In this case study, a ligand-based virtual high-throughput screening suite, bcl::ChemInfo, was applied to screen for activation of the protein target 17-beta hydroxysteroid dehydrogenase type 10 (HSD), which is involved in Alzheimer's Disease. bcl::ChemInfo implements a diverse set of machine learning techniques, including artificial neural networks (ANN), support vector machines (SVM) with the extension for regression, kappa nearest neighbor (KNN), and decision trees (DT). Molecular structures were converted into a distinct collection of descriptor groups involving 2D- and 3D-autocorrelation and radial distribution functions. A confirmatory high-throughput screening data set, available through PubChem, contained over 72,000 experimentally validated compounds. Systematic model development was achieved through optimization of feature sets and algorithmic parameters, resulting in a theoretical enrichment of 11 (44% of maximal enrichment) and an area under the ROC curve (AUC) of 0.75 for the best-performing machine learning technique on an independent data set. In addition, consensus combinations of all involved predictors were evaluated and achieved the best enrichment of 13 (50%) and an AUC of 0.86. All models were computed in silico and represent a viable option for guiding the drug discovery process through virtual library screening and compound prioritization prior to synthesis and biological testing. The best consensus predictor will be made accessible to the academic community at www.meilerlab.org.
Keywords :
biochemistry; bioinformatics; decision trees; diseases; drugs; learning (artificial intelligence); medical computing; molecular biophysics; neural nets; optimisation; proteins; regression analysis; support vector machines; 17-beta hydroxysteroid dehydrogenase type 10; 2D-autocorrelation; 3D-autocorrelation; ANN; Alzheimer disease; HSD activation; KNN; PubChem; SVM; algorithmic parameter; artificial neural networks; bcl::ChemInfo suite; compound prioritization; decision trees; descriptor groups; drug discovery process; feature set optimization; kappa nearest neighbor; machine learning model; molecular structures; protein target activation; radial distribution function; regression analysis; support vector machines; systematical model development; virtual high throughput screening suite; virtual library screening; Alzheimer's disease; Artificial neural networks; Biology; Compounds; Machine learning; Support vector machines; Area under the curve (AUC); Artificial Neural Network (ANN); Decision Trees (DT); Enrichment; Kohonen Network; Machine Learning; Quantitative Structure Activity Relation (QSAR); Receiver Operator Characteristics (ROC); Support Vector Machine (SVM); high-throughput screening (HTS); kappa - Nearest Neighbor (KNN)
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Computational Intelligence in Bioinformatics and Computational Biology (CIBCB), 2012 IEEE Symposium on
Conference_Location :
San Diego, CA
Print_ISBN :
978-1-4673-1190-8
Type :
conf
DOI :
10.1109/CIBCB.2012.6217248
Filename :
6217248