• Title of article

    Data-based interval estimation of classification error rates

  • Author/Authors

    Krzanowski, W. J. (author)

  • Issue Information
    Periodical with serial issue number, year 2001
  • From page
    585
  • Abstract
    Leave-one-out and .632 bootstrap are popular data-based methods of estimating the true error rate of a classification rule, but practical applications almost exclusively quote only point estimates. Interval estimation would provide better assessment of the future performance of the rule, but little has been published on this topic. We first review general-purpose jackknife and bootstrap methodology that can be used in conjunction with leave-one-out estimates to provide prediction intervals for true error rates of classification rules. Monte Carlo simulation is then used to investigate coverage rates of the resulting intervals for normal data, but the results are disappointing; standard intervals show considerable overinclusion, intervals based on Edgeworth approximations or random weighting do not perform well, and while a bootstrap approach provides intervals with coverage rates closer to the nominal ones, there is still marked underinclusion. We then turn to intervals constructed from .632 bootstrap estimates, and show that much better results are obtained. Although there is now some overinclusion, particularly for large training samples, the actual coverage rates are sufficiently close to the nominal rates for the method to be recommended. An application to real data illustrates the considerable variability that can arise in practical estimation of error rates.
  • Journal title
    JOURNAL OF APPLIED STATISTICS
  • Serial Year
    2001
  • Record number

    40715
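
The two point estimates named in the abstract can be illustrated with a minimal sketch. It is not the paper's code: the nearest-mean classifier, the synthetic normal data, and the function names `nearest_mean_predict`, `loo_error` and `b632_error` are all assumptions made for illustration. The .632 estimate is taken here in its usual form, 0.368 times the resubstitution error plus 0.632 times the out-of-bag bootstrap error, with the out-of-bag error averaged per replicate as a simplification. The interval constructions studied in the paper are not reproduced, since the abstract does not specify them.

```python
import numpy as np

def nearest_mean_predict(X_train, y_train, X_test):
    """Assign each test point to the class with the closest training mean."""
    classes = np.unique(y_train)
    means = np.stack([X_train[y_train == c].mean(axis=0) for c in classes])
    # Squared Euclidean distances, shape (n_test, n_classes)
    d2 = ((X_test[:, None, :] - means[None, :, :]) ** 2).sum(axis=2)
    return classes[d2.argmin(axis=1)]

def loo_error(X, y):
    """Leave-one-out error: refit without case i, then classify case i."""
    n = len(y)
    wrong = 0
    for i in range(n):
        keep = np.arange(n) != i
        pred = nearest_mean_predict(X[keep], y[keep], X[i:i + 1])[0]
        wrong += int(pred != y[i])
    return wrong / n

def b632_error(X, y, n_boot=200, rng=None):
    """.632 bootstrap error: 0.368 * resubstitution + 0.632 * out-of-bag."""
    rng = np.random.default_rng(rng)
    n = len(y)
    n_classes = np.unique(y).size
    resub = np.mean(nearest_mean_predict(X, y, X) != y)
    oob_errors = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)          # resample with replacement
        oob = np.setdiff1d(np.arange(n), idx)     # cases left out of the resample
        # Skip degenerate resamples (no held-out cases or a missing class)
        if oob.size == 0 or np.unique(y[idx]).size < n_classes:
            continue
        pred = nearest_mean_predict(X[idx], y[idx], X[oob])
        oob_errors.append(np.mean(pred != y[oob]))
    e0 = np.mean(oob_errors)                      # simplified per-replicate average
    return 0.368 * resub + 0.632 * e0

if __name__ == "__main__":
    # Synthetic two-class normal data (an assumption, not the paper's design)
    rng = np.random.default_rng(0)
    X = np.vstack([rng.normal(0.0, 1.0, (30, 2)), rng.normal(1.5, 1.0, (30, 2))])
    y = np.repeat([0, 1], 30)
    print("leave-one-out:", loo_error(X, y))
    print(".632 bootstrap:", b632_error(X, y, rng=1))
```

Both functions return a single point estimate; repeating them over fresh samples shows the variability that the abstract argues point estimates conceal.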