DocumentCode :
1458437
Title :
Using Kernel Alignment to Select Features of Molecular Descriptors in a QSAR Study
Author :
Wong, William W L ; Burkowski, Forbes J.
Author_Institution :
Toronto Health Econ. & Technol. Assessment Collaborative (THETA), Univ. of Toronto, Toronto, ON, Canada
Volume :
8
Issue :
5
fYear :
2011
Firstpage :
1373
Lastpage :
1384
Abstract :
Quantitative structure-activity relationships (QSARs) correlate biological activities of chemical compounds with their physicochemical descriptors. By modeling the observed relationship between molecular descriptors and their corresponding biological activities, we may predict the behavior of other molecules with similar descriptors. In QSAR studies, it has been shown that the quality of the prediction model strongly depends on the features selected from the molecular descriptors. Thus, methods capable of automatically selecting relevant features are very desirable. In this paper, we present a new feature selection algorithm for QSAR studies based on kernel alignment, which has been used as a measure of similarity between two kernel functions. In our algorithm, we deploy kernel alignment as an evaluation tool, using recursive feature elimination to compute a molecular descriptor containing the most important features needed for a classification application. Empirical results show that the algorithm works well for the computation of descriptors for various applications involving different QSAR data sets. The prediction accuracies are substantially increased and are comparable to those from earlier studies.
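The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a linear kernel, labels coded as +1/-1, the ideal target kernel yy^T, and a greedy backward elimination that drops one feature per round. The function names (`linear_kernel`, `alignment`, `rfe_by_alignment`) are hypothetical and chosen for illustration.

```python
import numpy as np

def linear_kernel(X):
    # Gram matrix of a linear kernel (an assumption; the paper may use other kernels).
    return X @ X.T

def alignment(K1, K2):
    # Kernel alignment: Frobenius inner product of the two Gram matrices,
    # normalized by the product of their Frobenius norms.
    num = np.sum(K1 * K2)
    den = np.sqrt(np.sum(K1 * K1) * np.sum(K2 * K2))
    return num / den if den > 0 else 0.0

def rfe_by_alignment(X, y, n_keep):
    # Greedy backward elimination: in each round, remove the feature whose
    # removal leaves the remaining kernel best aligned with the target yy^T.
    target = np.outer(y, y)
    active = list(range(X.shape[1]))
    while len(active) > n_keep:
        scores = []
        for f in active:
            rest = [g for g in active if g != f]
            scores.append(alignment(linear_kernel(X[:, rest]), target))
        active.pop(int(np.argmax(scores)))
    return active

# Toy data (made up for illustration): feature 0 equals the label,
# features 1 and 2 are orthogonal to it and carry no class information.
y = np.array([1, 1, 1, 1, -1, -1, -1, -1], dtype=float)
X = np.column_stack([
    y,
    np.array([1, -1, 1, -1, 1, -1, 1, -1], dtype=float),
    np.array([1, 1, -1, -1, 1, 1, -1, -1], dtype=float),
])
selected = rfe_by_alignment(X, y, n_keep=1)
```

In this toy case the procedure should retain only the informative feature. A practical implementation would also need a stopping rule and a cross-validated classifier to judge the selected descriptor, neither of which is shown here.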
Keywords :
biochemistry; biological techniques; biology computing; chemistry computing; molecular biophysics; molecular configurations; QSAR feature selection algorithm; automatic feature selection; chemical compound biological activities; kernel alignment; molecular descriptor feature selection; physicochemical descriptors; prediction model quality; quantitative structure-activity relationships; recursive feature elimination; Accuracy; Algorithm design and analysis; Classification algorithms; Kernel; Prediction algorithms; Support vector machines; Training; Feature selection; kernel alignment; quantitative structure-activity relationship (QSAR); Angiotensin-Converting Enzyme Inhibitors; Computational Biology; Databases, Factual; Humans; Intestinal Absorption; Models, Molecular; P-Glycoprotein; Pharmaceutical Preparations; Quantitative Structure-Activity Relationship; Support Vector Machines; Torsades de Pointes
fLanguage :
English
Journal_Title :
Computational Biology and Bioinformatics, IEEE/ACM Transactions on
Publisher :
IEEE
ISSN :
1545-5963
Type :
jour
DOI :
10.1109/TCBB.2011.31
Filename :
5719599