مرکز منطقه ای اطلاع رساني علوم و فناوري - An impossibility result for high dimensional supervised learning

DocumentCode :

659129

Title :

An impossibility result for high dimensional supervised learning

Author :

Rohban, Mohammad Hossein ; Ishwar, Prakash ; Orten, B. ; Karl, W.C. ; Saligrama, Venkatesh

Author_Institution :

ECE Dept., Boston Univ., Boston, MA, USA

fYear :

2013

fDate :

9-13 Sept. 2013

Firstpage :

Lastpage :

Abstract :

We study high-dimensional asymptotic performance limits of binary supervised classification problems where the class conditional densities are Gaussian with unknown means and covariances and the number of signal dimensions scales faster than the number of labeled training samples. We show that the Bayes error, namely the minimum attainable error probability with complete distributional knowledge and equally likely classes, can be arbitrarily close to zero and yet the limiting minimax error probability of every supervised learning algorithm is no better than a random coin toss. In contrast to related studies where the classification difficulty (Bayes error) is made to vanish, we hold it constant when taking high-dimensional limits. In contrast to VC-dimension based minimax lower bounds that consider the worst case error probability over all distributions that have a fixed Bayes error, our worst case is over the family of Gaussian distributions with constant Bayes error. We also show that a nontrivial asymptotic minimax error probability can only be attained for parametric subsets of zero measure (in a suitable measure space). These results expose the fundamental importance of prior knowledge and suggest that unless we impose strong structural constraints, such as sparsity, on the parametric space, supervised learning may be ineffective in high dimensional small sample settings.

Keywords :

Bayes methods; Gaussian distribution; learning (artificial intelligence); pattern classification; Bayes error; Gaussian distributions; VC-dimension based minimax lower bounds; binary supervised classification problems; class conditional densities; classification difficulty; distributional knowledge; high dimensional supervised learning; high-dimensional asymptotic performance limits; impossibility result; labeled training samples; parametric space; random coin toss; signal dimensions; structural constraints; worst case error probability; Error probability; Gaussian distribution; Maximum likelihood estimation; Measurement uncertainty; Sensors; Supervised learning; Training;

fLanguage :

English

Publisher :

ieee

Conference_Titel :

Information Theory Workshop (ITW), 2013 IEEE

Conference_Location :

Sevilla

Print_ISBN :

978-1-4799-1321-3

Type :

conf

DOI :

10.1109/ITW.2013.6691252

Filename :

6691252

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=659129