Title :
Random projections versus random selection of features for classification of high dimensional data
Author :
Mylavarapu, Sachin ; Kaban, Ata
Author_Institution :
Sch. of Comput. Sci., Univ. of Birmingham, Birmingham, UK
Abstract :
Random projections and random subspace methods are simple and computationally efficient techniques for reducing dimensionality when learning from high dimensional data. Since high dimensional data is prevalent in many domains, such techniques have attracted much recent interest. Random projections (RP) are motivated by their proven ability to preserve inter-point distances. In contrast, random selection of features (RF) appears to be a heuristic, yet it has exhibited good performance in previous studies. In this paper we conduct a thorough empirical comparison between these two approaches on a variety of data sets with different characteristics. We also extend our study to multi-class problems. We find that RP tends to perform better than RF in terms of classification accuracy in small sample settings, although RF is surprisingly good as well in many cases.
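Illustrative sketch :
A minimal NumPy sketch (not from the paper; the function names, toy data, and target dimension k are illustrative assumptions) contrasting the two reductions the abstract compares: RP multiplies the data by a scaled Gaussian matrix, which approximately preserves inter-point distances, while RF simply keeps a random subset of the original coordinates.

import numpy as np

rng = np.random.default_rng(0)

def random_projection(X, k):
    """Project d-dimensional rows of X to k dimensions with a Gaussian matrix."""
    d = X.shape[1]
    R = rng.normal(size=(d, k)) / np.sqrt(k)  # 1/sqrt(k) scaling preserves squared norms in expectation
    return X @ R

def random_feature_selection(X, k):
    """Keep a random subset of k original features (the random subspace idea)."""
    idx = rng.choice(X.shape[1], size=k, replace=False)
    return X[:, idx]

# Toy high dimensional data: 50 points in 1000 dimensions, reduced to k = 20.
X = rng.normal(size=(50, 1000))
k = 20
X_rp = random_projection(X, k)
X_rf = random_feature_selection(X, k)

def pairwise(A):
    """All pairwise Euclidean distances between rows of A."""
    return np.linalg.norm(A[:, None, :] - A[None, :, :], axis=-1)

orig = pairwise(X)
mask = orig > 0  # ignore the zero diagonal
print("RP distance ratio (mean):", (pairwise(X_rp)[mask] / orig[mask]).mean())
print("RF distance ratio (mean):", (pairwise(X_rf)[mask] / orig[mask]).mean())

On this toy data the RP ratio comes out close to 1, while the RF ratio is near sqrt(k/d) ≈ 0.14, illustrating why RP carries a distance-preservation guarantee while RF does not.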
Keywords :
data analysis; feature extraction; learning (artificial intelligence); pattern classification; random processes; classification accuracy; dimensionality reduction; high dimensional data classification; interpoint distance preservation; learning; multiclass problems; random feature selection; random projections; random subspace method; Accuracy; Classification algorithms; Principal component analysis; Radio frequency; Supervised learning; Support vector machines; Training; classification; dimensionality reduction; high dimensions; random projections; random subspace;
Conference_Title :
2013 13th UK Workshop on Computational Intelligence (UKCI)
Conference_Location :
Guildford
Print_ISBN :
978-1-4799-1566-8
DOI :
10.1109/UKCI.2013.6651321