Abstract:
We consider complexity penalization methods for model selection. These methods aim to choose a model to optimally trade off estimation and approximation errors by minimizing the sum of an empirical risk term and a complexity penalty.
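In symbols (an illustrative sketch in notation of our own choosing, not necessarily the paper's: $\hat R_n$ is the empirical risk on a sample of size $n$, $\hat f_k$ the empirical risk minimizer over the $k$th model class $\mathcal F_k$, and $\gamma_k$ the complexity penalty assigned to that class), the selected model can be written as
\[
\hat f_k = \operatorname*{arg\,min}_{f \in \mathcal F_k} \hat R_n(f),
\qquad
\hat k = \operatorname*{arg\,min}_{k} \bigl\{ \hat R_n(\hat f_k) + \gamma_k \bigr\},
\]
and the final choice is $\hat f_{\hat k}$.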
It is well known that if we use a bound on the maximal deviation between empirical and true risks as a complexity penalty, then the risk of our choice is no more than the approximation error plus twice the complexity penalty.
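To see where the factor of two comes from (again in the illustrative notation above, and assuming $\gamma_k \ge \sup_{f \in \mathcal F_k} |\hat R_n(f) - R(f)|$, where $R$ denotes the true risk), the standard argument is
\[
R(\hat f_{\hat k})
\;\le\; \hat R_n(\hat f_{\hat k}) + \gamma_{\hat k}
\;\le\; \hat R_n(\hat f_k) + \gamma_k
\;\le\; \inf_{f \in \mathcal F_k} R(f) + 2\gamma_k
\qquad \text{for every } k.
\]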
There are many cases, however, where complexity penalties like this give loose upper bounds on the estimation error. In particular, if we choose a function from a suitably simple convex function class with a strictly convex loss function, then the estimation error (the difference between the risk of the empirical risk minimizer and the minimal risk in the class) approaches zero at a faster rate than the maximal deviation between empirical and true risks. In this paper, we address the question of whether it is possible to design a complexity penalized model selection method for these situations. We show that, provided the sequence of models is ordered by inclusion, in these cases we can use tight upper bounds on estimation error as a complexity penalty. Surprisingly, this is the case even in situations when the difference between the empirical risk and true risk (and indeed the error of any estimate of the approximation error) decreases much more slowly than the complexity penalty.
We give an oracle inequality showing that the resulting model selection method chooses a function with risk no more than the approximation error plus a constant times the complexity penalty.
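Schematically (still in the illustrative notation introduced above, with $C$ a constant and $\inf_{f \in \mathcal F_k} R(f)$ playing the role of the approximation error of the $k$th class), an oracle inequality of this type takes the form
\[
R(\hat f_{\hat k}) \;\le\; \min_{k} \Bigl\{ \inf_{f \in \mathcal F_k} R(f) + C\,\gamma_k \Bigr\},
\]
where $\gamma_k$ is now the tight bound on estimation error rather than the slower uniform deviation bound.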