DocumentCode :
1525225
Title :
Simultaneous Support Recovery in High Dimensions: Benefits and Perils of Block ℓ1/ℓ∞-Regularization
Author :
Negahban, Sahand N. ; Wainwright, Martin J.
Author_Institution :
Dept. of Electr. Eng. & Comput. Sci., Univ. of California, Berkeley, CA, USA
Volume :
57
Issue :
6
fYear :
2011
fDate :
6/1/2011
Firstpage :
3841
Lastpage :
3863
Abstract :
Given a collection of r ≥ 2 linear regression problems in p dimensions, suppose that the regression coefficients share partially common supports of size at most s. This set-up suggests the use of ℓ1/ℓ∞-regularized regression for joint estimation of the p × r matrix of regression coefficients. We analyze the high-dimensional scaling of ℓ1/ℓ∞-regularized quadratic programming, considering both consistency rates in ℓ∞-norm, and how the minimal sample size n required for consistent variable selection scales with the model dimension, sparsity, and overlap between the supports. We first establish bounds on the ℓ∞-error as well as sufficient conditions for exact variable selection, both for fixed design matrices and for designs drawn randomly from general Gaussian distributions. Specializing to the case of r = 2 linear regression problems with standard Gaussian designs whose supports overlap in a fraction α ∈ [0,1] of their entries, we prove that the ℓ1/ℓ∞-regularized method undergoes a phase transition characterized by the rescaled sample size θ1,∞(n, p, s, α) = n/[(4 − 3α) s log(p − (2 − α)s)]. An implication is that the use of ℓ1/ℓ∞-regularization yields improved statistical efficiency if the overlap parameter is large enough (α > 2/3), but worse statistical efficiency than a naive Lasso-based approach for moderate to small overlap (α < 2/3). Empirical simulations illustrate the close agreement between theory and actual behavior in practice. These results show that caution must be exercised in applying ℓ1/ℓ∞ block regularization: if the data does not match its structure closely enough, it can impair statistical performance relative to computationally less expensive schemes.
Keywords :
Gaussian distribution; quadratic programming; regression analysis; Gaussian distributions; block ℓ1/ℓ∞-regularization; linear regression; quadratic programming; simultaneous support recovery; Estimation; Input variables; Joints; Linear regression; Multivariate regression; Noise; Symmetric matrices; ℓ1-constraints; compressed sensing; convex relaxation; group Lasso; high-dimensional inference; model selection; phase transitions; sparse approximation; subset selection;
fLanguage :
English
Journal_Title :
Information Theory, IEEE Transactions on
Publisher :
ieee
ISSN :
0018-9448
Type :
jour
DOI :
10.1109/TIT.2011.2144150
Filename :
5773043