Minimax rates of convergence for high-dimensional regression under ℓq-ball sparsity

Author

Raskutti, Garvesh ; Wainwright, Martin J. ; Yu, Bin

Author_Institution

Dept. of Stat., UC Berkeley, Berkeley, CA, USA

fYear

2009

fDate

Sept. 30 2009-Oct. 2 2009

Firstpage

251

Lastpage

257

Abstract

Consider the standard linear regression model y = Xβ* + w, where y ∈ Rⁿ is an observation vector, X ∈ R^{n×d} is a measurement matrix, β* ∈ R^d is the unknown regression vector, and w ~ N(0, σ² I) is additive Gaussian noise. This paper determines sharp minimax rates of convergence for estimation of β* in ℓ₂ norm, assuming that β* belongs to a weak ℓ_q-ball B_q(R_q) for some q ∈ [0, 1]. We show that under suitable regularity conditions on the design matrix X, the minimax error in squared ℓ₂-norm scales as R_q((log d)/n)^{1 - q/2}. In addition, we provide lower bounds on rates of convergence for general ℓ_p norm (for all p ∈ [1, +∞], p ≠ q). Our proofs of the lower bounds are information-theoretic in nature, based on Fano's inequality and results on the metric entropy of the balls B_q(R_q). Matching upper bounds are derived by direct analysis of the solution to an optimization algorithm over B_q(R_q). We prove that the conditions on X required by optimal algorithms are satisfied with high probability by broad classes of non-i.i.d. Gaussian random matrices, for which RIP or other sparse eigenvalue conditions are violated. For q = 0, ℓ₁-based methods (Lasso and Dantzig selector) achieve the minimax optimal rates in ℓ₂ error, but require stronger regularity conditions on the design than the non-convex optimization algorithm used to determine the minimax upper bounds.
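
As a rough illustration of the rate stated in the abstract, the following Python sketch evaluates the scaling R_q((log d)/n)^{1 - q/2}. The function name `minimax_rate_scaling` and its parameter names are hypothetical, and any constants or noise-level factors that the abstract does not state are omitted, so the output indicates only how the rate changes with n, d, q and R_q, not the actual minimax risk.

```python
import math


def minimax_rate_scaling(radius_q: float, n: int, d: int, q: float) -> float:
    """Evaluate R_q * ((log d) / n) ** (1 - q/2), the scaling quoted in the abstract.

    Hypothetical helper: constants and noise-variance factors are left out
    because the abstract does not give them.
    """
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must lie in [0, 1]")
    if n <= 0:
        raise ValueError("n must be a positive sample size")
    if d < 2:
        raise ValueError("d must be at least 2 so that log d is positive")
    if radius_q < 0:
        raise ValueError("radius_q must be non-negative")
    return radius_q * (math.log(d) / n) ** (1.0 - q / 2.0)


if __name__ == "__main__":
    # Illustrative values only; they are not taken from the paper.
    for q in (0.0, 0.5, 1.0):
        rate = minimax_rate_scaling(radius_q=10.0, n=1000, d=10_000, q=q)
        print(f"q = {q:.1f}: scaling = {rate:.6f}")
```

For q = 0 the expression reduces to R_0 (log d)/n, and for q = 1 it reduces to R_1 ((log d)/n)^{1/2}, so the dependence on the sample size n weakens as q increases.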

Keywords

Gaussian noise; convergence; eigenvalues and eigenfunctions; information theory; matrix algebra; minimax techniques; probability; random processes; regression analysis; vectors; Dantzig selector; Lasso selector; RIP; additive Gaussian noise; convergence minimax rates; design matrix; high-dimensional regression; information-theoretic; non-i.i.d. Gaussian random matrices; optimization algorithm; probability; regression vector; sparse eigenvalue conditions; standard linear regression model; ℓq-ball sparsity; Additive noise; Algorithm design and analysis; Convergence; Gaussian noise; Linear regression; Measurement standards; Minimax techniques; Noise measurement; Upper bound; Vectors;

fLanguage

English

Publisher

ieee

Conference_Titel

Communication, Control, and Computing, 2009. Allerton 2009. 47th Annual Allerton Conference on

Conference_Location

Monticello, IL

Print_ISBN

978-1-4244-5870-7

Type

conf

DOI

10.1109/ALLERTON.2009.5394804

Filename

5394804