Title :
Hybridization of Base Classifiers of Random Subsample Ensembles for Enhanced Performance in High Dimensional Feature Spaces
Author :
Pathical, Santhosh ; Serpen, Gursel
Author_Institution :
Electr. Eng. & Comput. Sci. Dept., Univ. of Toledo, Toledo, OH, USA
Abstract :
This paper presents a simulation-based empirical study of the performance profile of random subsample ensembles with a hybrid mix of base learners in high dimensional feature spaces. The performance of a hybrid random subsample ensemble that combines C4.5, k-nearest neighbor (kNN) and naïve Bayes base learners is assessed through statistical testing against homogeneous random subsample ensembles that employ only one type of base learner. The simulation study employs five datasets with up to 20K features from the UCI Machine Learning Repository. Random subsampling without replacement is used to map the original high dimensional feature space of each dataset to a multiplicity of lower dimensional feature subspaces. The study explores the effect of two design parameters, the count of base classifiers and the subsampling rate, on the performance of the hybrid random subspace ensemble. The ensemble architecture uses the voting combiner in all cases. Simulation results indicate that hybridizing the base learners of a random subsample ensemble improves prediction accuracy and yields more robust performance.
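The ensemble construction described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the round-robin assignment of learner types, the example subsampling rate of 0.01, and the tie-breaking rule in the vote are all assumptions, since the abstract does not specify them. Only the three ingredients (feature subsampling without replacement, a mix of C4.5, kNN and naïve Bayes learners, and a voting combiner) come from the abstract.

```python
import random
from collections import Counter


def draw_feature_subspaces(n_features, n_classifiers, rate, seed=0):
    """Draw one feature-index subset per base classifier.

    Indices are sampled without replacement within each subset, so no
    feature repeats inside a subspace (hypothetical helper).
    """
    rng = random.Random(seed)
    k = max(1, int(round(rate * n_features)))  # subspace size from the rate
    return [sorted(rng.sample(range(n_features), k))
            for _ in range(n_classifiers)]


def assign_learner_types(n_classifiers,
                         types=("C4.5", "kNN", "naive Bayes")):
    """Cycle through the learner types to get a hybrid mix (assumed scheme)."""
    return [types[i % len(types)] for i in range(n_classifiers)]


def majority_vote(predictions):
    """Return the most frequent label; ties go to the label seen first."""
    return Counter(predictions).most_common(1)[0][0]


# Example: 6 base classifiers over a 20,000-feature space.
subspaces = draw_feature_subspaces(n_features=20000, n_classifiers=6, rate=0.01)
learners = assign_learner_types(6)
```

In a full pipeline, each entry of `learners` would be trained on the columns listed in the matching entry of `subspaces`, and `majority_vote` would combine their per-sample predictions.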
Keywords :
Bayes methods; learning (artificial intelligence); pattern classification; statistical testing; C4.5; UCI machine learning repository; base classifier hybridization; base learner composition; ensemble architecture; high dimensional feature spaces; k-nearest neighbor; naïve Bayes base learners; random subsample ensembles; Accuracy; Classification algorithms; Internet; Machine learning; Machine learning algorithms; Robustness; Training; curse of dimensionality; ensemble selection; heterogeneous diversity; hybrid ensembles; random subsampling; random subspace
Conference_Title :
Machine Learning and Applications (ICMLA), 2010 Ninth International Conference on
Conference_Location :
Washington, DC
Print_ISBN :
978-1-4244-9211-4
DOI :
10.1109/ICMLA.2010.118