DocumentCode :
3348608
Title :
Pre-processing aspects for complexity reduction of the QSAR problem
Author :
Dumitriu, L. ; Segal, C. ; Craciun, M.-V. ; Cocu, A.
Author_Institution :
Comput. Sci. Dept., Dunarea de Jos Univ., Galati
Volume :
2
fYear :
2008
fDate :
6-8 Sept. 2008
Abstract :
Predictive Toxicology (PT) is one of the newest targets of the Knowledge Discovery in Databases (KDD) domain. Its goal is to describe the relationships between the structure of chemical compounds and biological and toxicological processes. In real PT problems there is a very important topic to be considered: the huge number of chemical descriptors. Irrelevant, redundant, noisy and unreliable data have a negative impact; therefore, one of the main goals in KDD is to detect these undesirable properties and to eliminate or correct them. This requires data cleaning, noise reduction and feature selection, because the performance of the applied Machine Learning algorithms is strongly related to the quality of the data used. In this paper, we present some of the issues that can be taken into account when preparing data before the actual knowledge discovery is performed.
Keywords :
chemistry computing; data mining; learning (artificial intelligence); toxicology; QSAR problem; chemical structure; complexity reduction; data cleaning; feature selection; knowledge discovery; machine learning; noise reduction; predictive toxicology; preprocessing aspects; Chemical compounds; Data mining; Databases; Neural networks; Noise reduction; Pattern analysis; Pattern recognition; Principal component analysis; Statistical analysis; Toxicology; knowledge discovery in databases; prediction; toxicology;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Intelligent Systems, 2008. IS '08. 4th International IEEE Conference
Conference_Location :
Varna
Print_ISBN :
978-1-4244-1739-1
Electronic_ISBN :
978-1-4244-1740-7
Type :
conf
DOI :
10.1109/IS.2008.4670547
Filename :
4670547