DocumentCode
909213
Title
Fuzzy ARTMAP Prediction of Biological Activities for Potential HIV-1 Protease Inhibitors Using a Small Molecular Data Set
Author
Andonie, Razvan ; Fabry-Asztalos, Levente ; Abdul-Wahid, Christopher Badi ; Abdul-Wahid, Sarah ; Barker, Grant I. ; Magill, Lukas C.
Author_Institution
Comput. Sci. Dept., Central Washington Univ., Ellensburg, WA, USA
Volume
8
Issue
1
fYear
2011
Firstpage
80
Lastpage
93
Abstract
Obtaining satisfactory results with neural networks depends on the availability of large data samples. The use of small training sets generally reduces performance. Most classical Quantitative Structure-Activity Relationship (QSAR) studies for a specific enzyme system have been performed on small data sets. We focus on the neuro-fuzzy prediction of biological activities of HIV-1 protease inhibitory compounds when inferring from small training sets. We propose two computational intelligence prediction techniques that are suitable for small training sets, at the expense of some computational overhead. Both techniques are based on the FAMR model. The FAMR is a Fuzzy ARTMAP (FAM) incremental learning system used for classification and probability estimation. During the learning phase, each sample pair is assigned a relevance factor proportional to the importance of that pair. The two algorithms proposed in this paper are: 1) The GA-FAMR algorithm, which is new and consists of two stages: a) In the first stage, we use a genetic algorithm (GA) to optimize the relevances assigned to the training data. This improves the generalization capability of the FAMR. b) In the second stage, we use the optimized relevances to train the FAMR. 2) The Ordered FAMR, which is derived from a known algorithm. Instead of optimizing relevances, it optimizes the order of data presentation using the algorithm of Dagher et al. In our experiments, we compare these two algorithms with a previously introduced algorithm not based on the FAM, the FS-GA-FNN. We conclude that when inferring from small training sets, both techniques are efficient in terms of generalization capability and execution time. The computational overhead introduced is compensated by better accuracy. Finally, the proposed techniques are used to predict the biological activities of newly designed potential HIV-1 protease inhibitors.
Keywords
QSAR; bioinformatics; enzymes; fuzzy logic; genetic algorithms; FS-GA-FNN; GA-FAMR algorithm; HIV-1 protease inhibitor; Quantitative Structure-Activity Relationship; biological activity; enzyme; fuzzy ARTMAP prediction; molecular data set; neurofuzzy prediction; probability estimation; Biological system modeling; Biology computing; Chemistry; Computer networks; Computer science; Data mining; Fuzzy neural networks; Genetic algorithms; Inhibitors; Neural networks; computational chemistry; data mining; evolutionary computing and genetic algorithms; Algorithms; Computational Biology; Data Mining; Databases, Factual; Drug Discovery; Fuzzy Logic; HIV Protease Inhibitors; Linear Models; Models, Genetic; Models, Statistical; Neural Networks (Computer); Physicochemical Phenomena; Reproducibility of Results
fLanguage
English
Journal_Title
Computational Biology and Bioinformatics, IEEE/ACM Transactions on
Publisher
IEEE
ISSN
1545-5963
Type
jour
DOI
10.1109/TCBB.2009.50
Filename
4967569