Learning Convex Concepts from Gaussian Distributions with PCA

Author

Vempala, Santosh S.

Author_Institution

Sch. of Comput. Sci., Georgia Tech, Atlanta, GA, USA

fYear

2010

Firstpage

124

Lastpage

130

Abstract

We present a new algorithm for learning a convex set in n-dimensional space given labeled examples drawn from any Gaussian distribution. The complexity of the algorithm is bounded by a fixed polynomial in n times a function of k and ϵ where k is the dimension of the normal subspace (the span of normal vectors to supporting hyperplanes of the convex set) and the output is a hypothesis that correctly classifies at least 1 - ϵ of the unknown Gaussian distribution. For the important case when the convex set is the intersection of k halfspaces, the complexity is poly(n, k, 1/ϵ) + n · min k(O(log k/ϵ⁴)), (k/ϵ)^O(k), improving substantially on the state of the art [Vem04], [KOS08] for Gaussian distributions. The key step of the algorithm is a Singular Value Decomposition after applying a normalization. The proof is based on a monotonicity property of Gaussian space under convex restrictions.

Keywords

Gaussian distribution; computational complexity; learning (artificial intelligence); principal component analysis; set theory; singular value decomposition; Gaussian distribution; Gaussian space; PCA; convex restriction; convex set; learning; monotonicity property; polynomial complexity; singular value decomposition; Accuracy; Algorithm design and analysis; Complexity theory; Covariance matrix; Gaussian distribution; Polynomials; Principal component analysis; Gaussians; High-dimensional learning; PCA; convex; polynomial time;

fLanguage

English

Publisher

ieee

Conference_Titel

Foundations of Computer Science (FOCS), 2010 51st Annual IEEE Symposium on

Conference_Location

Las Vegas, NV

ISSN

0272-5428

Print_ISBN

978-1-4244-8525-3

Type

conf

DOI

10.1109/FOCS.2010.19

Filename

5670814