• DocumentCode
    1645648
  • Title
    Strip mining for molecules
  • Author
    Embrechts, Mark J. ; Arciniegas, F. ; Ozdemir, Muhsin ; Momma, M. ; Breneman, Curt M. ; Lockwood, L. ; Bennett, K.P. ; Kewley, R.H.
  • Author_Institution
    Dept. of Decision Sci. & Eng. Syst., Rensselaer Polytech. Inst., Troy, NY, USA
  • Volume
    1
  • fYear
    2002
  • fDate
    2002
  • Firstpage
    305
  • Lastpage
    310
  • Abstract
    Quantitative structure-activity relationship (QSAR) problems deal with "in-silico" chemical design for the virtual invention of novel pharmaceuticals. The goal of QSAR is to predict the bioactivities of molecules based on a set of descriptive features. QSAR problems are notoriously challenging for machine learning because a typical QSAR predictive data mining problem set is characterized by a large number of descriptive features (300-1000), often for a relatively small number of molecules (50-300). This paper introduces data strip mining for QSAR modeling. Strip mining is a general approach for feature selection and predictive modeling based on successive stages of feature elimination, carried out by performing a sensitivity analysis on a predictive model.
  • Keywords
    chemical engineering computing; data mining; molecular biophysics; pharmaceutical industry; sensitivity analysis; QSAR; biomolecular activities; chemical design; data strip mining; feature selection; pharmaceuticals; predictive data mining; predictive model; quantitative structure activity relationship; Chemicals; Data mining; Design engineering; Drugs; Human immunodeficiency virus; Pharmaceuticals; Predictive models; Principal component analysis; Sensitivity analysis; Strips
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Neural Networks, 2002. IJCNN '02. Proceedings of the 2002 International Joint Conference on
  • Conference_Location
    Honolulu, HI
  • ISSN
    1098-7576
  • Print_ISBN
    0-7803-7278-6
  • Type
    conf
  • DOI
    10.1109/IJCNN.2002.1005488
  • Filename
    1005488