  • DocumentCode
    909213
  • Title

    Fuzzy ARTMAP Prediction of Biological Activities for Potential HIV-1 Protease Inhibitors Using a Small Molecular Data Set

  • Author

    Andonie, Razvan ; Fabry-Asztalos, Levente ; Abdul-Wahid, Christopher Badi ; Abdul-Wahid, Sarah ; Barker, Grant I. ; Magill, Lukas C.

  • Author_Institution
    Comput. Sci. Dept., Central Washington Univ., Ellensburg, WA, USA
  • Volume
    8
  • Issue
    1
  • fYear
    2011
  • Firstpage
    80
  • Lastpage
    93
  • Abstract
    Obtaining satisfactory results with neural networks depends on the availability of large data samples, and small training sets generally reduce performance. Most classical Quantitative Structure-Activity Relationship (QSAR) studies for a specific enzyme system have been performed on small data sets. We focus on the neuro-fuzzy prediction of biological activities of HIV-1 protease inhibitory compounds when inferring from small training sets. We propose two computational intelligence prediction techniques that are suitable for small training sets, at the expense of some computational overhead. Both techniques are based on the FAMR model, a Fuzzy ARTMAP (FAM) incremental learning system used for classification and probability estimation. During the learning phase, each sample pair is assigned a relevance factor proportional to the importance of that pair. The two proposed algorithms are: 1) The GA-FAMR algorithm, which is new and consists of two stages: a) in the first stage, a genetic algorithm (GA) optimizes the relevances assigned to the training data, which improves the generalization capability of the FAMR; b) in the second stage, the optimized relevances are used to train the FAMR. 2) The Ordered FAMR, which is derived from a known algorithm: instead of optimizing relevances, it optimizes the order of data presentation using the algorithm of Dagher et al. In our experiments, we compare these two algorithms with the FS-GA-FNN, a previously introduced algorithm that is not based on the FAM. We conclude that when inferring from small training sets, both techniques are efficient in terms of generalization capability and execution time, and that the computational overhead introduced is compensated by better accuracy. Finally, the proposed techniques are used to predict the biological activities of newly designed potential HIV-1 protease inhibitors.
  • Keywords
    QSAR; bioinformatics; enzymes; fuzzy logic; genetic algorithms; FS-GA-FNN; GA-FAMR algorithm; HIV-1 protease inhibitor; Quantitative Structure-Activity Relationship; biological activity; enzyme; fuzzy ARTMAP prediction; molecular data set; neurofuzzy prediction; probability estimation; Biological system modeling; Biology computing; Chemistry; Computer networks; Computer science; Data mining; Fuzzy neural networks; Genetic algorithms; Inhibitors; Neural networks; computational chemistry; data mining; evolutionary computing and genetic algorithms; Algorithms; Computational Biology; Data Mining; Databases, Factual; Drug Discovery; Fuzzy Logic; HIV Protease Inhibitors; Linear Models; Models, Genetic; Models, Statistical; Neural Networks (Computer); Physicochemical Phenomena; Quantitative Structure-Activity Relationship; Reproducibility of Results
  • fLanguage
    English
  • Journal_Title
    Computational Biology and Bioinformatics, IEEE/ACM Transactions on
  • Publisher
    IEEE
  • ISSN
    1545-5963
  • Type

    jour

  • DOI
    10.1109/TCBB.2009.50
  • Filename
    4967569