Regional Information Center for Science and Technology (RICeST) - Classification of molecular structures made easy

DocumentCode :

2961840

Title :

Classification of molecular structures made easy

Author :

Trentin, Edmondo ; Iorio, Ernesto Di

Author_Institution :

Dipt. di Ing. dell'Inf., Univ. di Siena, Siena

fYear :

2008

fDate :

1-8 June 2008

Firstpage :

3241

Lastpage :

3246

Abstract :

Several problems in bioinformatics and cheminformatics concern the classification of molecules. Relevant instances are automatic cancer detection/classification, machine-learning pathologic prediction, automatic predictive toxicology, etc. Molecules may be represented in terms of graphical structures in a natural way: each node in the graph can be used to represent an atom, whilst the edges of the graph represent the atom-atom bonds. Labels (in the form of real-valued vectors) are associated with nodes and edges in order to express physical and chemical properties of the corresponding atoms and bonds, respectively. These structured data are expected to contain more information than a traditional (flat) feature vector, information that may strengthen the classification capabilities of a machine learner. This paper investigates the application of a novel Bayesian/connectionist classifier to this graphical pattern recognition task. The approach is much simpler than state-of-the-art machine learning paradigms for graphical/relational learning. It relies on the idea of describing the graph in terms of a binary relation. The posterior probability of a class given the relation is estimated as a function of probabilistic quantities modeled with a neural network, trained over individual vertex pairs in the graph. The popular and challenging Mutagenesis dataset is considered for the experimental evaluation. Despite its simplicity, the technique turns out to yield the highest recognition accuracies to date on the complete (friendly + unfriendly) dataset, outperforming complex machines (relational and graph neural nets, kernels for graphs, inductive logic programming techniques, etc.). Some preliminary chemical/biological implications are eventually hypothesized in the light of the results obtained.

Keywords :

Bayes methods; biology computing; graph theory; molecular biophysics; pattern classification; probability; Bayesian/connectionist classifier; Mutagenesis dataset; bioinformatics; cheminformatics; graphical pattern recognition; molecular structure; molecules classification; neural network; posterior probability; Bayesian methods; Bioinformatics; Cancer detection; Chemicals; Kernel; Logic programming; Machine learning; Neural networks; Pattern recognition; Toxicology;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Neural Networks, 2008. IJCNN 2008. (IEEE World Congress on Computational Intelligence). IEEE International Joint Conference on

Conference_Location :

Hong Kong

ISSN :

1098-7576

Print_ISBN :

978-1-4244-1820-6

Electronic_ISBN :

1098-7576

Type :

conf

DOI :

10.1109/IJCNN.2008.4634258

Filename :

4634258

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=2961840