DocumentCode
3348608
Title
Pre-processing aspects for complexity reduction of the QSAR problem
Author
Dumitriu, L. ; Segal, C. ; Craciun, M.-V. ; Cocu, A.
Author_Institution
Comput. Sci. Dept., Dunarea de Jos Univ., Galati
Volume
2
fYear
2008
fDate
6-8 Sept. 2008
Abstract
Predictive Toxicology (PT) is one of the newest targets of the Knowledge Discovery in Databases (KDD) domain. Its goal is to describe the relationships between the chemical structure of compounds and biological and toxicological processes. Real PT problems raise an important issue: the very large number of chemical descriptors. Irrelevant, redundant, noisy and unreliable data have a negative impact, so one of the main goals in KDD is to detect these undesirable properties and to eliminate or correct them. This calls for data cleaning, noise reduction and feature selection, because the performance of the applied Machine Learning algorithms is strongly related to the quality of the data used. In this paper, we present some of the issues that can be taken into account when preparing data before the actual knowledge discovery is performed.
Keywords
chemistry computing; data mining; learning (artificial intelligence); toxicology; QSAR problem; chemical structure; complexity reduction; data cleaning; feature selection; knowledge discovery; machine learning; noise reduction; predictive toxicology; preprocessing aspects; Chemical compounds; Data mining; Databases; Neural networks; Noise reduction; Pattern analysis; Pattern recognition; Principal component analysis; Statistical analysis; Toxicology; knowledge discovery in databases; prediction; toxicology
fLanguage
English
Publisher
IEEE
Conference_Titel
Intelligent Systems, 2008. IS '08. 4th International IEEE Conference
Conference_Location
Varna
Print_ISBN
978-1-4244-1739-1
Electronic_ISBN
978-1-4244-1740-7
Type
conf
DOI
10.1109/IS.2008.4670547
Filename
4670547