• DocumentCode
    1458437
  • Title

    Using Kernel Alignment to Select Features of Molecular Descriptors in a QSAR Study

  • Author

    Wong, William W. L.; Burkowski, Forbes J.

  • Author_Institution
    Toronto Health Economics and Technology Assessment Collaborative (THETA), University of Toronto, Toronto, ON, Canada
  • Volume
    8
  • Issue
    5
  • fYear
    2011
  • Firstpage
    1373
  • Lastpage
    1384
  • Abstract
    Quantitative structure-activity relationships (QSARs) correlate the biological activities of chemical compounds with their physicochemical descriptors. By modeling the relationship observed between molecular descriptors and their corresponding biological activities, we may predict the behavior of other molecules with similar descriptors. In QSAR studies, it has been shown that the quality of the prediction model depends strongly on the features selected within the molecular descriptors. Thus, methods capable of automatically selecting relevant features are very desirable. In this paper, we present a new feature selection algorithm for QSAR studies based on kernel alignment, which has been used as a measure of similarity between two kernel functions. In our algorithm, we deploy kernel alignment as an evaluation tool, using recursive feature elimination to compute a molecular descriptor containing the most important features needed for a classification application. Empirical results show that the algorithm works well for the computation of descriptors for various applications involving different QSAR data sets. The prediction accuracies are substantially increased and are comparable to those from earlier studies.
  • Keywords
    biochemistry; biological techniques; biology computing; chemistry computing; molecular biophysics; molecular configurations; QSAR feature selection algorithm; automatic feature selection; chemical compound biological activities; kernel alignment; molecular descriptor feature selection; physicochemical descriptors; prediction model quality; quantitative structure-activity relationships; recursive feature elimination; Accuracy; Algorithm design and analysis; Classification algorithms; Kernel; Prediction algorithms; Support vector machines; Training; Feature selection; kernel alignment; quantitative structure-activity relationship (QSAR); Angiotensin-Converting Enzyme Inhibitors; Computational Biology; Databases, Factual; Humans; Intestinal Absorption; Models, Molecular; P-Glycoprotein; Pharmaceutical Preparations; Quantitative Structure-Activity Relationship; Support Vector Machines; Torsades de Pointes
  • fLanguage
    English
  • Journal_Title
    Computational Biology and Bioinformatics, IEEE/ACM Transactions on
  • Publisher
    IEEE
  • ISSN
    1545-5963
  • Type

    jour

  • DOI
    10.1109/TCBB.2011.31
  • Filename
    5719599
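
The abstract describes using kernel alignment as the evaluation criterion inside recursive feature elimination. The following is only a rough sketch of that general idea, not the authors' algorithm: it assumes the standard empirical alignment (a normalized Frobenius inner product between two kernel matrices), an RBF kernel, the ideal target kernel y yᵀ for labels in {-1, +1}, and a greedy one-feature-at-a-time elimination. The names `alignment`, `rbf_kernel`, `select_features`, and the synthetic data are all illustrative assumptions.

```python
import numpy as np

def alignment(K1, K2):
    """Empirical kernel alignment: normalized Frobenius inner product."""
    return np.sum(K1 * K2) / np.sqrt(np.sum(K1 * K1) * np.sum(K2 * K2))

def rbf_kernel(X, gamma=1.0):
    """RBF kernel matrix (an assumed kernel choice, for illustration)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    return np.exp(-gamma * d2)

def select_features(X, y, n_keep, gamma=1.0):
    """Greedy backward elimination scored by alignment with y y^T."""
    target = np.outer(y, y)  # ideal kernel for labels in {-1, +1}
    active = list(range(X.shape[1]))
    while len(active) > n_keep:
        scores = []
        for f in active:
            rest = [g for g in active if g != f]
            scores.append(alignment(rbf_kernel(X[:, rest], gamma), target))
        # Drop the feature whose removal leaves the highest alignment.
        active.pop(int(np.argmax(scores)))
    return active

# Synthetic data (hypothetical): only column 2 carries the class signal.
rng = np.random.default_rng(0)
y = np.repeat([-1.0, 1.0], 20)
X = rng.normal(size=(40, 4))
X[:, 2] = 2.0 * y + 0.1 * rng.normal(size=40)

selected = select_features(X, y, n_keep=1)
print(selected)
```

Each elimination step recomputes one kernel matrix per remaining feature, so this sketch is suitable only for small descriptor sets; a practical implementation would likely remove features in batches or reuse distance computations.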