Title :
A knowledge discovery using decision tree by Gini coefficient
Author :
Sundhari, S. Sivagama
Author_Institution :
Fac. of Eng. & Inf. Technol., Kuala Lumpur Metropolitan Univ. Coll., Kuala Lumpur, Malaysia
Abstract :
Decision trees have proved very effective for classification, especially in data mining. Knowledge Discovery (KD) is an active and important research area that promises a high payoff in many business and scientific applications. One of the main tasks in KD is classification, and the decision tree is a particularly efficient method for it. The choice of the attribute used to split the data at each node of the tree is crucial for classifying objects correctly. A split in a decision tree corresponds to the predictor with the maximum separating power. In other words, the best split does the best job of creating nodes where a single class dominates. There are several methods of calculating a predictor's power to separate data. One of the best-known methods is based on the Gini coefficient of inequality. In this paper we introduce a formal description that allows us to compare splits selected by the Gini coefficient with splits selected by guesswork. The knowledge discovered with the Gini coefficient approach was more accurate than that obtained from splits selected by guesswork.
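The split selection described in the abstract can be illustrated with a minimal sketch. It assumes the Gini impurity as used in CART-style trees (one minus the sum of squared class proportions), which may differ from the exact formulation in the paper. The function names, the toy records, and the attribute names are hypothetical and are not taken from the paper.

```python
from collections import Counter


def gini_impurity(labels):
    """Gini impurity of a list of class labels: 1 - sum(p_k^2)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def weighted_gini(rows, labels, attribute):
    """Size-weighted Gini impurity of the child nodes after splitting on one attribute."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attribute], []).append(label)
    n = len(labels)
    return sum(len(g) / n * gini_impurity(g) for g in groups.values())


def best_split(rows, labels, attributes):
    """Attribute whose split gives the lowest weighted impurity (purest child nodes)."""
    return min(attributes, key=lambda a: weighted_gini(rows, labels, a))


# Hypothetical toy data: "outlook" separates the classes perfectly, "windy" does not.
rows = [
    {"outlook": "sunny", "windy": "no"},
    {"outlook": "sunny", "windy": "yes"},
    {"outlook": "rainy", "windy": "no"},
    {"outlook": "rainy", "windy": "yes"},
]
labels = ["play", "play", "stay", "stay"]

chosen = best_split(rows, labels, ["outlook", "windy"])
```

In this toy data, splitting on "outlook" yields child nodes in which a single class dominates, so it has the lower weighted impurity and is the attribute selected. A split on "windy" leaves both children evenly mixed.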
Keywords :
data mining; decision trees; pattern classification; Gini coefficient; attribute selection; formal description; knowledge discovery; object classification; split selection; Business; Histograms; Impurities; Indexes; C4.5; CART; Gini index or Gini coefficient; ID3; KD; SLIQ; SPRINT
Conference_Title :
Business, Engineering and Industrial Applications (ICBEIA), 2011 International Conference on
Conference_Location :
Kuala Lumpur
Print_ISBN :
978-1-4577-1279-1
DOI :
10.1109/ICBEIA.2011.5994250