• DocumentCode
    2770519
  • Title

    Breakdown Point of Model Selection When the Number of Variables Exceeds the Number of Observations

  • Author

    Donoho, David ; Stodden, Victoria

  • Author_Institution
    Stanford Univ., Stanford
  • fYear
    0
  • fDate
    0-0 0
  • Firstpage
    1916
  • Lastpage
    1921
  • Abstract
    The classical multivariate linear regression problem assumes p variables X1, X2, ..., Xp and a response vector y, each with n observations, and a linear relationship between the two: y = Xβ + z, where z ~ N(0, σ^2). We point out that when p > n, there is a breakdown point for standard model selection schemes, such that model selection only works well below a certain critical complexity level depending on n/p. We apply this notion to some standard model selection algorithms (Forward Stepwise, LASSO, LARS) in the case where p ≫ n. We find that 1) the breakdown point is well-defined for random X-models and low noise, 2) increasing noise shifts the breakdown point to lower levels of sparsity, and reduces the model recovery ability of the algorithm in a systematic way, and 3) below breakdown, the size of coefficient errors follows the theoretical error distribution for the classical linear model.
  • Keywords
    regression analysis; model recovery ability; model selection breakdown point; multivariate linear regression problem; response vector; Electric breakdown; Equations; Linear regression; Noise level; Noise reduction; Predictive models; Signal processing; Signal processing algorithms; Statistics; Vectors;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Neural Networks, 2006. IJCNN '06. International Joint Conference on
  • Conference_Location
    Vancouver, BC
  • Print_ISBN
    0-7803-9490-9
  • Type

    conf

  • DOI
    10.1109/IJCNN.2006.246934
  • Filename
    1716344