Title :
Grammatically based genetic programming for mining relational databases
Author :
Ishida, Celso Y. ; Pozo, Aurora
Author_Institution :
Comput. Sci. Dept., Fed. Univ. of Parana, Brazil
Abstract :
Knowledge discovery is the most desirable end product of an enterprise information system. Researchers from different areas recognize that a new generation of intelligent tools for automated data mining is needed. In this sense, this paper explores the grammatically based genetic programming (GGP) approach for mining relational databases, specifically for the classification task. Genetic programming is a powerful induction technique that can be applied to solve a wide variety of problems and, in particular, to induce classifiers. Furthermore, knowledge representation using grammars makes it possible to represent restrictions, complex structured objects, and relations among objects or their components. A framework named GPSQL Miner was developed using this approach. It also exploits SQL features to work with the database management system (DBMS), which permits fast access to the data. GPSQL Miner automatically generates the input grammar from user-supplied information, such as the goal and the tables, stored in the DBMS. This grammar is used as a bias for the evolution process. To validate this approach, the paper presents results of experiments performed on many databases. These experiments show that the proposed approach is robust, powerful, flexible, and capable of attaining good performance.
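The abstract's two central ideas, a grammar generated from table information and rule evaluation delegated to SQL, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from GPSQL Miner: the `customer` table, the helper names `build_grammar`, `derive`, and `fitness`, the grammar shape, and the sensitivity-times-specificity score are all hypothetical. The sketch also swaps the evolutionary loop for plain random sampling of grammar derivations, so it shows how a grammar biases the search space rather than how genetic programming explores it.

```python
import random
import sqlite3


def build_grammar(conn, table, goal):
    """Derive a small WHERE-clause grammar from the table's own metadata."""
    columns = [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]
    conditions = []
    for col in columns:
        if col == goal:
            continue  # the goal attribute must never appear inside a rule
        values = [r[0] for r in conn.execute(f"SELECT DISTINCT {col} FROM {table}")]
        for v in values:
            literal = f"'{v}'" if isinstance(v, str) else str(v)
            ops = ["="] if isinstance(v, str) else ["<=", ">"]
            conditions += [[f"{col} {op} {literal}"] for op in ops]
    return {
        "<rule>": [
            ["<cond>"],
            ["<cond>", " AND (", "<rule>", ")"],
            ["<cond>", " OR (", "<rule>", ")"],
        ],
        "<cond>": conditions,
    }


def derive(grammar, symbol, rng, depth=0, max_depth=3):
    """Expand a symbol into a string by choosing productions at random."""
    if symbol not in grammar:
        return symbol  # terminal text is emitted unchanged
    options = grammar[symbol]
    if depth >= max_depth:
        # Past the depth limit, keep only productions that do not recurse.
        options = [p for p in options if symbol not in p] or options
    production = rng.choice(options)
    return "".join(derive(grammar, s, rng, depth + 1, max_depth) for s in production)


def fitness(conn, table, goal, target, where):
    """Score a rule as sensitivity * specificity; the DBMS does the counting."""
    query = (
        "SELECT "
        f"SUM(CASE WHEN ({where}) AND {goal} = ? THEN 1 ELSE 0 END), "
        f"SUM(CASE WHEN ({where}) AND {goal} <> ? THEN 1 ELSE 0 END), "
        f"SUM(CASE WHEN NOT ({where}) AND {goal} = ? THEN 1 ELSE 0 END), "
        f"SUM(CASE WHEN NOT ({where}) AND {goal} <> ? THEN 1 ELSE 0 END) "
        f"FROM {table}"
    )
    tp, fp, fn, tn = conn.execute(query, (target,) * 4).fetchone()
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    return sensitivity * specificity


# Hypothetical toy data: predict churn = 'yes' from age and plan.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (age INTEGER, plan TEXT, churn TEXT)")
conn.executemany(
    "INSERT INTO customer VALUES (?, ?, ?)",
    [(22, "basic", "yes"), (25, "basic", "yes"), (31, "basic", "no"),
     (40, "premium", "no"), (47, "premium", "no"), (23, "premium", "yes")],
)

grammar = build_grammar(conn, "customer", goal="churn")
rng = random.Random(0)
# Random sampling stands in for the evolutionary search described in the abstract.
candidates = sorted({derive(grammar, "<rule>", rng) for _ in range(200)})
best = max(candidates, key=lambda w: fitness(conn, "customer", "churn", "yes", w))
best_score = fitness(conn, "customer", "churn", "yes", best)
print(best, best_score)
```

Because every candidate is derived from the grammar, each one is a syntactically valid SQL predicate over the non-goal columns, which is the sense in which a grammar acts as a bias on the search. Table and column names are interpolated directly into the SQL text here, which is acceptable only for trusted, hard-coded identifiers such as those in this toy example.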
Keywords :
SQL; automatic programming; classification; data mining; genetic algorithms; grammars; knowledge representation; object-oriented programming; relational databases; software tools; GPSQL miner; automated data mining; complex structured objects; database management system; enterprise information system; grammatically-based genetic programming; induction technique; knowledge discovery; Axles; Computer science; Data analysis; Delta modulation; Genetic programming; Humans; Information systems; Production
Conference_Title :
Proceedings of the 23rd International Conference of the Chilean Computer Science Society (SCCC 2003)
Print_ISBN :
0-7695-2008-1
DOI :
10.1109/SCCC.2003.1245449