Title of article :
Rejecting unclassifiable samples with decision forests
Author/Authors :
Keefer, Christopher E. and Woody, Nathaniel A.
Issue Information :
Biannual, serial issue, year 2006
Pages :
6
From page :
40
To page :
45
Abstract :
Validation of empirical models is designed to produce statistics related to the average error rate of the model. These statistics can be used to minimize errors arising from extrapolation in the Y-values, but they pay no attention to the X-block of predicted samples and cannot provide sample-specific prediction confidences. In this manuscript, a novel method for identifying potentially poorly classified samples is described that is applicable to any Decision Forest method. The samples identified as unclassifiable are given a “no-class” assignment, and it is shown that these samples have a much higher error rate than samples assigned to a class. These samples are identified by creating a proximity matrix that holds the similarity of each test sample to each training sample. This similarity is defined in terms of the paths the samples took through the trees and can be used as a transformed descriptor set for a k-nearest neighbor classifier. The Decision Forest prediction and the k-nearest neighbor prediction can then be combined to assign the sample prediction in such a way that the expected error of the prediction is estimated more accurately. The method is fully automatic and does not require any parameters beyond the determination of k.
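The abstract does not spell out the similarity measure or the combination rule, so the following is only a minimal sketch under stated assumptions. It assumes proximity is the fraction of trees in which a test sample and a training sample end in the same leaf (the usual random-forest proximity), and that a sample is given the “no-class” assignment when the forest prediction and the k-nearest neighbor prediction disagree. The function names (`proximity`, `knn_predict`, `combine`) and the toy leaf indices are hypothetical and not taken from the article.

```python
from collections import Counter

NO_CLASS = "no-class"

def proximity(test_leaves, train_leaves):
    """Proximity matrix: rows are test samples, columns are training samples.

    Each sample is described by the leaf index it reached in every tree.
    Assumption: similarity = fraction of trees where both reach the same leaf.
    """
    n_trees = len(train_leaves[0])
    return [
        [sum(a == b for a, b in zip(t, r)) / n_trees for r in train_leaves]
        for t in test_leaves
    ]

def knn_predict(prox_row, train_labels, k):
    """Majority vote among the k training samples most similar to one test sample."""
    nearest = sorted(range(len(prox_row)), key=lambda i: prox_row[i], reverse=True)[:k]
    votes = Counter(train_labels[i] for i in nearest)
    return votes.most_common(1)[0][0]

def combine(forest_preds, test_leaves, train_leaves, train_labels, k=3):
    """Keep the forest prediction when k-NN agrees; otherwise assign no-class.

    Assumption: disagreement is the rejection criterion; the article may use
    a different rule.
    """
    prox = proximity(test_leaves, train_leaves)
    result = []
    for forest_pred, row in zip(forest_preds, prox):
        knn_pred = knn_predict(row, train_labels, k)
        result.append(forest_pred if forest_pred == knn_pred else NO_CLASS)
    return result

# Toy data: leaf indices per sample across three trees (hypothetical values).
train_leaves = [[1, 2, 1], [1, 2, 3], [4, 5, 6], [4, 5, 7]]
train_labels = ["A", "A", "B", "B"]
test_leaves = [[1, 2, 1], [4, 5, 6]]
forest_preds = ["A", "A"]

print(combine(forest_preds, test_leaves, train_leaves, train_labels, k=3))
# -> ['A', 'no-class']
```

In this toy case the second test sample lands in the leaves of class B training samples while the forest says A, so it is rejected. The only tunable value is `k`, in line with the abstract's statement that no other parameters are needed.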
Keywords :
“no-class”, Decision forest (DF) method, Proximity matrix, K-nearest neighbor
Journal title :
Chemometrics and Intelligent Laboratory Systems
Serial Year :
2006
Record number :
1461725