DocumentCode :
3184295
Title :
On Classification Confidence and Ranking Using Decision Trees
Author :
Tóth, Norbert ; Pataki, Béla
Author_Institution :
Budapest Univ. of Technol. & Econ., Budapest
fYear :
2007
fDate :
June 29, 2007 - July 2, 2007
Firstpage :
133
Lastpage :
138
Abstract :
In this paper a novel method is proposed that extends the decision tree framework, allowing standard decision tree classifiers to provide a certainty value for every input sample they classify. This value is calculated for each input sample individually and represents the classifier's certainty in that classification. The algorithm consists of three main parts. 1) The input sample's distance to the decision boundary is calculated; this step involves solving a set of linearly constrained quadratic programs. The distance-calculating procedure also allows the use of different distance metrics, where the minimal-distance projection is not necessarily invariant. 2) Kernel density estimation is performed on the distance values of a training set to obtain conditional true and false classification profiles. 3) Using these conditional densities, Bayesian computation is applied to calculate the conditional true classification probability, which is used as the classification certainty. The proposed algorithm is not limited to axis-parallel trees; it can be applied to any decision tree whose decisions are hyperplanes (not necessarily parallel to the axes). The algorithm does not alter the tree structure, and the growth process is not modified; it only uses the training data to obtain true and false classification profiles conditional on the distance from the decision boundary. The usability of the method is demonstrated on two examples: an artificial two-dimensional dataset and a real-world nine-dimensional dataset. It is shown that the method can significantly increase classification accuracy (at the cost of rejecting a certain number of samples whose classification would be too "risky"). It is also demonstrated that the classification certainty value can be effectively used for ranking purposes.
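The three steps summarized above (distance to the boundary via a quadratic program, kernel density estimation on the distances, Bayesian combination) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the polyhedral leaf representation (A, b), the SLSQP solver choice, the default KDE bandwidth, and all function names are illustrative assumptions.

# Minimal sketch of the three-step certainty estimate, assuming each tree leaf
# region is available as a half-space system {z : A z <= b} with a class label.
import numpy as np
from scipy.optimize import minimize
from scipy.stats import gaussian_kde

def distance_to_region(x, A, b):
    """Euclidean distance from x to the polyhedron {z : A z <= b},
    solved as a linearly constrained quadratic program (step 1)."""
    res = minimize(
        fun=lambda z: np.sum((z - x) ** 2),
        x0=np.asarray(x, dtype=float),
        method="SLSQP",
        constraints=[{"type": "ineq", "fun": lambda z: b - A @ z}],
    )
    return np.sqrt(res.fun)

def boundary_distance(x, predicted_class, leaves):
    """Distance to the decision boundary: nearest leaf region whose class
    differs from the tree's prediction. `leaves` is a list of (A, b, label)."""
    return min(distance_to_region(x, A, b)
               for A, b, label in leaves if label != predicted_class)

def fit_certainty_model(distances, correct):
    """Step 2: conditional KDEs of the boundary distance for correctly and
    incorrectly classified training samples, plus the prior P(correct)."""
    distances = np.asarray(distances)
    correct = np.asarray(correct, dtype=bool)
    kde_true = gaussian_kde(distances[correct])
    kde_false = gaussian_kde(distances[~correct])
    return kde_true, kde_false, correct.mean()

def certainty(d, kde_true, kde_false, p_true):
    """Step 3: Bayes' rule gives P(correct | distance d), used as the
    classification certainty and as a ranking / rejection score."""
    num = kde_true(d) * p_true
    den = num + kde_false(d) * (1.0 - p_true)
    return float(num / den)

In this sketch, rejecting samples whose certainty falls below a chosen threshold reproduces the accuracy/rejection trade-off mentioned in the abstract, and sorting samples by the certainty value gives the ranking use case.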
Keywords :
Bayes methods; decision trees; quadratic programming; Bayesian computation; artificial two dimensional dataset; axis parallel trees; classification certainty; classification confidence; classification profiles; conditional densities; conditional true classification probability; decision trees; distance metrics; kernel density estimation; linearly constrained quadratic programs; real world nine dimensional dataset; Bayesian methods; Classification algorithms; Classification tree analysis; Costs; Decision trees; Kernel; Probability; Training data; Tree data structures; Usability;
fLanguage :
English
Publisher :
ieee
Conference_Titel :
Intelligent Engineering Systems, 2007. INES 2007. 11th International Conference on
Conference_Location :
Budapest
Print_ISBN :
1-4244-1147-5
Electronic_ISBN :
1-4244-1148-3
Type :
conf
DOI :
10.1109/INES.2007.4283686
Filename :
4283686