DocumentCode :
3112850
Title :
Classification in High-Dimensional Feature Spaces: Random Subsample Ensemble
Author :
Serpen, Gursel ; Pathical, Santhosh
Author_Institution :
Electr. Eng. & Comput. Sci. Dept., Univ. of Toledo, Toledo, OH, USA
fYear :
2009
fDate :
13-15 Dec. 2009
Firstpage :
740
Lastpage :
745
Abstract :
This paper presents the application of machine learning ensembles, which randomly project the original high-dimensional feature space onto multiple lower-dimensional feature subspaces, to classification problems with high-dimensional feature spaces. The motivation is to address challenges associated with algorithm scalability, data sparsity, and information loss due to the so-called curse of dimensionality. Each of the lower-dimensional subspaces constitutes the domain of a classification subtask and is associated with a base learner within the ensemble. Such an ensemble is called a random subsample ensemble. Simulation results on data sets with up to 20,000 features indicate that the random subsample ensemble classifier performs comparably to other benchmark machine learners in terms of prediction accuracy and CPU time. This finding establishes the feasibility of the ensemble and positions it to tackle classification problems with even higher-dimensional feature spaces.
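The idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that each "subspace" is a random subset of the original feature indices, that the base learner is a simple nearest-centroid classifier (swapped in here for brevity), and that base learner outputs are combined by majority vote. All function and parameter names (`fit_ensemble`, `subspace_dim`, `n_learners`, and so on) are hypothetical.

```python
import random
from collections import Counter


def fit_centroids(X, y, feats):
    """Nearest-centroid base learner (an assumption, not the paper's learner),
    trained only on the feature indices listed in feats."""
    sums, counts = {}, Counter(y)
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(feats))
        for j, f in enumerate(feats):
            acc[j] += row[f]
    # Per-class mean of each selected feature.
    return {c: [s / counts[c] for s in acc] for c, acc in sums.items()}


def predict_one(centroids, feats, row):
    """Return the class whose centroid is closest within the selected subspace."""
    def dist(c):
        return sum((row[f] - m) ** 2 for f, m in zip(feats, centroids[c]))
    return min(sorted(centroids), key=dist)


def fit_ensemble(X, y, n_learners=5, subspace_dim=3, seed=0):
    """Train one base learner per randomly drawn feature subset."""
    rng = random.Random(seed)
    n_features = len(X[0])
    ensemble = []
    for _ in range(n_learners):
        # Draw subspace_dim distinct feature indices for this base learner.
        feats = rng.sample(range(n_features), subspace_dim)
        ensemble.append((feats, fit_centroids(X, y, feats)))
    return ensemble


def predict(ensemble, row):
    """Combine base learner predictions by majority vote (an assumed rule)."""
    votes = Counter(predict_one(c, feats, row) for feats, c in ensemble)
    return votes.most_common(1)[0][0]
```

Because each base learner sees only `subspace_dim` features, its training cost depends on the subspace size rather than on the full dimensionality, which is the scalability motivation the abstract refers to.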
Keywords :
data analysis; feature extraction; learning (artificial intelligence); pattern classification; algorithm scalability; base learner; benchmark machine learners; classification subtask domain; data sets; data sparsity; dimensionality curse; ensemble machine-learner context; high dimensional feature spaces; information loss; lower dimensional feature subspaces; machine learning ensembles; random subsample ensemble; Application software; Computational complexity; Computational efficiency; Computer science; Machine learning; Machine learning algorithms; Performance evaluation; Principal component analysis; Scalability; Space technology; curse of dimensionality; ensemble classification; high dimensional feature space; random subspace;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Machine Learning and Applications, 2009. ICMLA '09. International Conference on
Conference_Location :
Miami Beach, FL
Print_ISBN :
978-0-7695-3926-3
Type :
conf
DOI :
10.1109/ICMLA.2009.26
Filename :
5381318