Title :
A Hybrid Evolutionary Approach To Construct Optimal Decision Trees With Large Data Sets
Author :
Patil, D.V. ; Bichkar, R.S.
Author_Institution :
S.G.G.S. Inst. of Eng. & Tech., Nanded, Maharashtra
Abstract :
Data mining environments produce large volumes of data. The knowledge this data contains can be used to improve an organization's decision-making process. When such large amounts of data are used for decision tree construction, the resulting trees are large and incomprehensible to human experts. Learning on high-volume data also becomes very slow, as it has to be done serially on the available large data sets. Our ultimate goal is to build smaller trees that are equally accurate, using randomly sampled data. We experimented with techniques that combine incremental random sampling with genetic algorithms, which use global search to evolve decision trees, to obtain a compact representation of a large data set. Experiments on some data sets showed that the proposed random sampling procedures with genetic algorithms build decision trees that are relatively smaller than those produced by other methods while remaining equally accurate. The method combines optimization with comprehensibility and scalability. We also explore how the method can avoid problems that arise with very large databases, such as slow execution and overloading of memory and processor.
Keywords :
data mining; decision making; decision trees; genetic algorithms; global search techniques; hybrid evolutionary approach; incremental random sampling; large data sets; optimal decision trees; optimization; Biological cells; Classification tree analysis; Humans; Optimization methods; Sampling methods; Testing; Comprehensibility; classification accuracy; decision tree; genetic algorithm; genetically evolved decision tree; training set size
Conference_Title :
Industrial Technology, 2006. ICIT 2006. IEEE International Conference on
Conference_Location :
Mumbai
Print_ISBN :
1-4244-0726-5
Electronic_ISBN :
1-4244-0726-5
DOI :
10.1109/ICIT.2006.372250