DocumentCode :
2319904
Title :
GPU-accelerated machine learning techniques enable QSAR modeling of large HTS data
Author :
Lowe, Edward W., Jr. ; Butkiewicz, Mariusz ; Woetzel, Nils ; Meiler, Jens
Author_Institution :
Chem., Vanderbilt Univ., Nashville, TN, USA
fYear :
2012
fDate :
9-12 May 2012
Firstpage :
314
Lastpage :
320
Abstract :
Quantitative structure-activity relationship (QSAR) modeling using high-throughput screening (HTS) data is a powerful technique that enables the construction of predictive models. These models are used for the in silico screening of libraries of molecules for which experimental screening is expensive in both cost and time. Machine learning techniques excel in QSAR modeling, where the relationship between structure and activity is often complex and non-linear. As these HTS data sets continue to grow in the number of compounds screened, extensive feature selection and cross-validation become computationally expensive. Leveraging massively parallel architectures such as graphics processing units (GPUs) to accelerate the training algorithms for these machine learning techniques is a cost-efficient way to combat this problem. In this work, several machine learning techniques are ported to OpenCL for GPU acceleration to enable construction of QSAR ensemble models using HTS data. We report computational performance numbers using several HTS data sets freely available from the PubChem database. We also report results of a case study using HTS data for a target of pharmacological and pharmaceutical relevance, cytochrome P450 3A4, for which an enrichment of 94% of the theoretical maximum is achieved.
Keywords :
bioinformatics; biological techniques; graphics processing units; learning (artificial intelligence); molecular biophysics; molecular configurations; parallel architectures; proteins; GPU accelerated machine learning techniques; OpenCL; QSAR modeling; computational performance; cross validation; cytochrome P450 3A4; feature selection; graphics processing units; high throughput screening data; in silico molecular library screening; large HTS data; massively parallel architectures; pharmaceutical target; pharmacological target; predictive model construction; quantitative structure activity relationship modeling; training algorithm acceleration; Artificial neural networks; Graphics processing unit; High temperature superconductors; Machine learning; Predictive models; Support vector machines; Training; GPU-Accelerated; High-throughput screening; Machine learning; OpenCL; quantitative structure activity relationship;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Computational Intelligence in Bioinformatics and Computational Biology (CIBCB), 2012 IEEE Symposium on
Conference_Location :
San Diego, CA
Print_ISBN :
978-1-4673-1190-8
Type :
conf
DOI :
10.1109/CIBCB.2012.6217246
Filename :
6217246