Author/Authors :
Aydın Ulaş, Olcay Taner Yıldız, Ethem Alpaydın
Abstract :
In practice, the classifiers in an ensemble are not independent. This paper continues our previous work on ensemble subset selection [A. Ulaş, M. Semerci, O.T. Yıldız, E. Alpaydın, Incremental construction of classifier and discriminant ensembles, Information Sciences, 179 (9) (2009) 1298–1318] and has two parts. First, we investigate the effect of four factors on correlation: (i) the algorithms used for training, (ii) the hyperparameters of the algorithms, (iii) resampled training sets, and (iv) input feature subsets. Simulations using 14 classifiers on 38 data sets indicate that hyperparameters and overlapping training sets have a stronger effect on positive correlation than features and algorithms. Second, we propose a postprocessing step, applied before fusion, that uses principal component analysis (PCA) to form uncorrelated eigenclassifiers from a set of correlated experts. Combining the information from all classifiers may be better than subset selection, where some base classifiers are pruned before combination, because using all of them allows for redundancy.
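The idea of forming uncorrelated eigenclassifiers from correlated experts can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the procedure in detail, so the function name `eigenclassifiers`, the synthetic expert outputs, and the use of a plain covariance eigendecomposition are all assumptions made for illustration. The sketch projects the outputs of several positively correlated base classifiers onto the principal axes of their covariance matrix, which yields components that are uncorrelated on the sample.

```python
import numpy as np

def eigenclassifiers(outputs, n_components=None):
    """Project base-classifier outputs onto their principal components.

    outputs: array of shape (n_samples, n_experts), one column per expert.
    Hypothetical helper, written for illustration only.
    """
    mean = outputs.mean(axis=0)
    centered = outputs - mean
    cov = np.cov(centered, rowvar=False)
    # eigh returns eigenvalues in ascending order for symmetric matrices
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1]          # largest variance first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    if n_components is not None:
        eigvals, eigvecs = eigvals[:n_components], eigvecs[:, :n_components]
    return centered @ eigvecs, eigvals, eigvecs, mean

# Synthetic data (an assumption): four experts that share a common signal,
# so their outputs are strongly positively correlated.
rng = np.random.default_rng(0)
signal = rng.normal(size=(500, 1))
experts = signal + 0.3 * rng.normal(size=(500, 4))

projected, eigvals, eigvecs, mean = eigenclassifiers(experts)

corr_before = np.corrcoef(experts, rowvar=False)   # correlations among experts
cov_after = np.cov(projected, rowvar=False)        # diagonal up to rounding
```

In this sketch the projected columns play the role of eigenclassifier outputs that a fusion stage would then combine. Passing `n_components` keeps only the leading components, which is one possible way to reduce dimensionality without pruning any individual base classifier.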