Regional Information Center for Science and Technology - A general model for finite-sample effects in training and testing of competing classifiers

DocumentCode :

838433

Title :

A general model for finite-sample effects in training and testing of competing classifiers

Author :

Beiden, Sergey V. ; Maloof, Marcus A. ; Wagner, Robert F.

Author_Institution :

Food & Drug Adm., Center for Devices & Radiol. Health, Rockville, MD, USA

Volume :

Issue :

fYear :

2003

Firstpage :

1561

Lastpage :

1569

Abstract :

The conventional wisdom in the field of statistical pattern recognition (SPR) is that the size of the finite test sample dominates the variance in the assessment of the performance of a classical or neural classifier. The present work shows that this result has only narrow applicability. In particular, when competing algorithms are compared, the finite training sample more commonly dominates this uncertainty. This general problem in SPR is analyzed using a formal structure recently developed for multivariate random-effects receiver operating characteristic (ROC) analysis. Monte Carlo trials within the general model are used to explore the detailed statistical structure of several representative problems in the subfield of computer-aided diagnosis in medicine. The scaling laws between variance of accuracy measures and number of training samples and number of test samples are investigated and found to be comparable to those discussed in the classic text of Fukunaga, but important interaction terms have been neglected by previous authors. Finally, the importance of the contribution of finite trainers to the uncertainties argues for some form of bootstrap analysis to sample that uncertainty. The leading contemporary candidate is an extension of the 0.632 bootstrap and associated error analysis, as opposed to the more commonly used cross-validation.
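The abstract's central comparison, variance from the finite training sample versus variance from the finite test sample, can be illustrated with a small Monte Carlo experiment. The sketch below is not the authors' multivariate random-effects ROC model. It is a deliberately simplified stand-in that assumes two unit-variance 1-D Gaussian classes, a toy midpoint-threshold classifier, and plain accuracy instead of ROC area. All function names and parameters (`draw_sample`, `train_threshold`, `variance_components`, `separation`, and so on) are hypothetical, and the two variances it reports are crude marginal estimates that ignore the interaction terms the abstract highlights.

```python
import random
import statistics


def draw_sample(n_per_class, rng, separation=1.0):
    """Draw n_per_class points from each of two unit-variance 1-D Gaussians.

    The class distributions and their separation are assumptions made
    for this illustration only.
    """
    neg = [rng.gauss(0.0, 1.0) for _ in range(n_per_class)]
    pos = [rng.gauss(separation, 1.0) for _ in range(n_per_class)]
    return neg, pos


def train_threshold(neg, pos):
    """Toy classifier: a threshold halfway between the two class sample means."""
    return 0.5 * (statistics.fmean(neg) + statistics.fmean(pos))


def accuracy(threshold, neg, pos):
    """Fraction of test points that fall on the correct side of the threshold."""
    correct = sum(x <= threshold for x in neg) + sum(x > threshold for x in pos)
    return correct / (len(neg) + len(pos))


def variance_components(n_train, n_test, n_trainers=40, n_testers=40, seed=0):
    """Cross every trained classifier with every test set.

    Returns the grand mean accuracy plus two crude marginal variances:
    the variance of per-training-set mean accuracies and the variance of
    per-test-set mean accuracies.
    """
    rng = random.Random(seed)
    # One trained threshold per independent training sample.
    thresholds = [
        train_threshold(*draw_sample(n_train, rng)) for _ in range(n_trainers)
    ]
    # Independent test samples, shared by all trained classifiers.
    test_sets = [draw_sample(n_test, rng) for _ in range(n_testers)]
    # acc[i][j] = accuracy of classifier i on test set j.
    acc = [[accuracy(t, neg, pos) for (neg, pos) in test_sets] for t in thresholds]
    row_means = [statistics.fmean(row) for row in acc]
    col_means = [statistics.fmean(col) for col in zip(*acc)]
    return {
        "mean_accuracy": statistics.fmean(row_means),
        "var_across_training_sets": statistics.variance(row_means),
        "var_across_test_sets": statistics.variance(col_means),
    }


if __name__ == "__main__":
    for n_train, n_test in [(5, 200), (200, 5)]:
        print(n_train, n_test, variance_components(n_train, n_test))
```

Running the script with a small training sample and a large test sample, and then the reverse, gives a rough feel for which source of finite-sample variability dominates in each regime under these toy assumptions.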

Keywords :

Monte Carlo methods; error analysis; learning (artificial intelligence); pattern recognition; sensitivity analysis; testing; bootstrap analysis; competing algorithms; computer-aided diagnosis; finite sample effects; finite testing; medicine; multivariate random effects; neural classifier; receiver operating characteristic analysis; scaling laws; statistical pattern recognition; training; uncertainty; medical diagnostic imaging; performance analysis; sampling methods; size measurement

fLanguage :

English

Journal_Title :

Pattern Analysis and Machine Intelligence, IEEE Transactions on

Publisher :

IEEE

ISSN :

0162-8828

Type :

jour

DOI :

10.1109/TPAMI.2003.1251149

Filename :

1251149

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=838433