Title :
Predicting chemical activities from structures by attributed molecular graph classification
Author :
Xu, Qian ; Hu, Derek Hao ; Xue, Hong ; Yang, Qiang
Author_Institution :
Hong Kong Univ. of Sci. & Technol., Kowloon, China
Abstract :
Designing Quantitative Structure-Activity Relationship (QSAR) models has been a recurrent research interest for biologists and computer scientists. An example is predicting the toxicity of chemical compounds from their structural properties, represented as graphs. A popular way to classify these graphs is to use classifiers such as support vector machines (SVMs) together with graph kernels that incorporate the sequential, structural and chemical information. Previous work has focused on designing specific graph kernels for this task, among which graph alignment kernels are one of the most popular approaches. Graph alignment kernels align the nodes of one graph to the nodes of a second graph so that the total similarity is maximized over all possible alignments. However, taking both vertex and edge similarities into account makes the problem NP-hard. In this paper, we present a novel, general graph-matching-based method for QSAR. We view the problem of calculating optimal assignments between two attributed graphs from a different perspective. Instead of first designing an atom kernel function and a bond kernel function, we first provide a training set of pairs of graphs with their corresponding matchings. We then learn the compatibility function over atoms and use only the atom kernel function to compute graph matchings. Our algorithm has the advantage of being more general than previous approaches to the QSAR problem while remaining efficient. We evaluate our method on a set of chemical structure-activity prediction benchmark datasets and show that it achieves accuracy better than or comparable to that of the optimal assignment kernel method.
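The alignment step described in the abstract can be illustrated with a minimal Python sketch, offered only as an illustration and not as the authors' method. It matches the atoms of two small molecules so that the summed atom similarity is maximal, using atom similarities only. The `atom_kernel` function, its weights, and the `element`/`degree` attributes are hypothetical, hand-written stand-ins for the compatibility function that the paper learns from training matchings. The search is exhaustive for clarity; with atom similarities alone, the same optimum can be found in polynomial time with the Hungarian algorithm.

```python
from itertools import permutations


def atom_kernel(a, b):
    """Hypothetical atom similarity (not the learned function from the paper):
    1.0 for the same element, plus 0.5 for the same heavy-atom degree."""
    score = 0.0
    if a["element"] == b["element"]:
        score += 1.0
    if a["degree"] == b["degree"]:
        score += 0.5
    return score


def optimal_assignment(atoms_x, atoms_y, kernel=atom_kernel):
    """Return (best_score, matching) over all one-to-one assignments.

    Each atom of the smaller graph is assigned to a distinct atom of the
    larger graph. `matching` is a list of (index_in_x, index_in_y) pairs.
    Exhaustive search: only suitable for very small graphs.
    """
    swapped = len(atoms_x) > len(atoms_y)
    small, large = (atoms_y, atoms_x) if swapped else (atoms_x, atoms_y)

    best_score, best_perm = float("-inf"), ()
    for perm in permutations(range(len(large)), len(small)):
        score = sum(kernel(small[i], large[j]) for i, j in enumerate(perm))
        if score > best_score:
            best_score, best_perm = score, perm

    # Report pairs in (x, y) order regardless of which graph was smaller.
    matching = [(j, i) if swapped else (i, j) for i, j in enumerate(best_perm)]
    return best_score, matching


# Heavy atoms only; degrees count bonds to other heavy atoms.
methanol = [
    {"element": "C", "degree": 1},
    {"element": "O", "degree": 1},
]
ethanol = [
    {"element": "C", "degree": 1},
    {"element": "C", "degree": 2},
    {"element": "O", "degree": 1},
]

score, matching = optimal_assignment(methanol, ethanol)
print(score, matching)
```

In this toy example the methanol carbon is matched to the terminal carbon of ethanol and the two oxygens are matched to each other. The resulting score could serve as a similarity value between the two molecules in a kernel-based classifier such as an SVM, although assignment-based similarities are not guaranteed to be positive semidefinite in general.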
Keywords :
chemistry computing; graph theory; pattern classification; support vector machines; NP-hard problem; atom kernel function; attributed molecular graph classification; bond kernel function; chemical activity prediction; chemical information; graph alignment kernels; graph-matching; quantitative structure-activity relationship; sequential information; structural information; support vector machines; Biology computing; Chemical compounds; Decision trees; Drugs; Kernel; Logistics; Machine learning algorithms; Regression tree analysis; Support vector machines; Training data;
Conference_Title :
Computational Intelligence in Bioinformatics and Computational Biology (CIBCB), 2010 IEEE Symposium on
Conference_Location :
Montreal, QC
Print_ISBN :
978-1-4244-6766-2
DOI :
10.1109/CIBCB.2010.5510690