DocumentCode :
3618190
Title :
Transduction and typicalness for quality assessment of individual classifications in machine learning and data mining
Author :
M. Kukar
Author_Institution :
Fac. of Comput. & Inf. Sci., Ljubljana Univ., Slovenia
fYear :
2004
fDate :
2004
Firstpage :
146
Lastpage :
153
Abstract :
In the past, machine learning algorithms have been successfully used in many problems, and are emerging as valuable data analysis tools. However, their serious practical use is hampered by the fact that, more often than not, they cannot produce reliable and unbiased assessments of their predictions' quality. In recent years, several approaches for estimating the reliability or confidence of individual classifiers have emerged, many of them building upon the algorithmic theory of randomness, such as (in historical order) transduction-based confidence estimation, typicalness-based confidence estimation, and transductive reliability estimation. Unfortunately, they all have weaknesses: either they are tightly bound to particular learning algorithms, or the interpretation of their reliability estimates is not always consistent with statistical confidence levels. In this paper, we propose a joint approach that compensates for these weaknesses by integrating typicalness-based confidence estimation and transductive reliability estimation into a joint confidence machine. The resulting confidence machine produces confidence values in the statistical sense (e.g., a confidence level of 95% means that in 95% of cases the predicted class is also the true class), and provides a general principle that is independent of the particular underlying classifier. We perform a series of tests with several different machine learning algorithms in several problem domains. We compare our results with those of a proprietary TCM-NN method as well as with kernel density estimation. We show that the proposed method significantly outperforms density estimation methods, and how it may be used to improve their performance.
Keywords :
"Quality assessment","Machine learning","Data mining","Machine learning algorithms","Classification tree analysis","Reliability theory","Kernel","Bayesian methods","Neural networks","Probability distribution"
Publisher :
IEEE
Conference_Titel :
Data Mining, 2004. ICDM '04. Fourth IEEE International Conference on
Print_ISBN :
0-7695-2142-8
Type :
conf
DOI :
10.1109/ICDM.2004.10089
Filename :
1410278