DocumentCode :
3388148
Title :
Is There Correlation Between the Estimated and True Classification Errors in Small-Sample Settings?
Author :
Hanczar, Blaise ; Hua, B.Jianping ; Dougherty, Edward R.
Author_Institution :
Department of Electrical and Computer Engineering, Texas A&M University, College Station, USA
fYear :
2007
fDate :
26-29 Aug. 2007
Firstpage :
16
Lastpage :
20
Abstract :
The validity of a classifier model, consisting of a trained classifier and its estimated error, depends upon the relationship between the estimated and true errors of the classifier. Absent a good error estimation rule, the classifier-error model lacks scientific meaning. This paper demonstrates that in high-dimensionality feature selection settings in the context of small samples there can be virtually no correlation between the true and estimated errors. This conclusion has serious ramifications in the domain of high-throughput genomic classification, such as gene-expression classification, where the number of potential features (gene expressions) is usually in the tens of thousands and the number of sample points (microarrays) is often under one hundred.
Keywords :
Bioinformatics; Biological system modeling; Computational biology; Computer errors; Error analysis; Gene expression; Genomics; Process design; Random variables; Sampling methods; classification; error estimation; small-sample;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Statistical Signal Processing, 2007. SSP '07. IEEE/SP 14th Workshop on
Conference_Location :
Madison, WI, USA
Print_ISBN :
978-1-4244-1198-6
Electronic_ISBN :
978-1-4244-1198-6
Type :
conf
DOI :
10.1109/SSP.2007.4301209
Filename :
4301209