Abstract:
This paper considers the problem of predicting binary choices by selecting from a
possibly large set of candidate explanatory variables, which can include both exogenous
variables and lagged dependent variables. We consider risk minimization in which
the risk function is the predictive classification error. We study the convergence
rates of empirical risk minimization in both the frequentist and Bayesian approaches.
The Bayesian treatment uses a Gibbs posterior constructed directly from the empirical
risk rather than the usual likelihood-based posterior; consequently, neither approach
requires a correctly specified probability model. We show that the
proposed methods have near-optimal performance relative to a class of linear classification
rules with selected variables. These classification results are obtained in a
framework of dependent data satisfying strong mixing conditions.
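
For reference, a minimal sketch of a generic Gibbs-posterior construction of the kind described above; the notation ($R_n$, $\lambda$, $\pi$) is introduced here purely for illustration and is not taken from this abstract:
\[
\pi_n(\theta \mid \text{data}) \;\propto\; \exp\{-\lambda\, n\, R_n(\theta)\}\, \pi(\theta),
\]
where $R_n(\theta)$ is the empirical classification risk of the rule indexed by $\theta$, $\pi$ is a prior over candidate rules (and selected variables), and $\lambda > 0$ is a learning-rate parameter. Replacing the likelihood with $\exp\{-\lambda\, n\, R_n(\theta)\}$ is what frees the posterior from requiring a correctly specified probability model.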