Title :
Sharing classifiers among ensembles from related problem domains
Author :
Zhang, Yi ; Street, W. Nick ; Burer, Samuel
Author_Institution :
Dept. of Manage. Sci., Iowa Univ., IA, USA
Abstract :
A classification ensemble is a group of classifiers that all solve the same prediction problem in different ways. It is well known that combining the predictions of classifiers within the same problem domain, using techniques such as bagging or boosting, often improves performance. This research shows that sharing classifiers among different but closely related problem domains can also be helpful. In addition, a semidefinite-programming-based ensemble pruning method is implemented to optimize the selection of a subset of classifiers for each problem domain. Computational results on a catalog dataset indicate that ensembles built by sharing classifiers among different product categories generally have larger AUCs than ensembles trained only on their own categories. The pruning algorithm not only prevents the occasional decrease in effectiveness caused by conflicting concepts among the problem domains, but also provides a better understanding of the problem domains and their relationships.
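The idea in the abstract, pooling classifiers trained on related domains and then selecting a subset for one target domain, can be illustrated with a minimal sketch. This is not the authors' method: the semidefinite programming step is replaced here by a plain greedy forward selection on validation AUC, and the names `auc`, `ensemble_scores`, and `greedy_prune` are hypothetical, introduced only for this illustration. Classifiers are assumed to be callables that map a feature vector to a score, and the ensemble prediction is assumed to be the unweighted mean of member scores.

```python
from typing import Callable, List, Sequence

# Assumption: a classifier is any callable mapping a feature vector to a score.
Classifier = Callable[[Sequence[float]], float]


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Area under the ROC curve by pairwise comparison (ties count as 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def ensemble_scores(members: List[Classifier], X: Sequence[Sequence[float]]) -> List[float]:
    """Unweighted mean of the member scores for each row of X."""
    return [sum(clf(x) for clf in members) / len(members) for x in X]


def greedy_prune(pool: List[Classifier],
                 X_val: Sequence[Sequence[float]],
                 y_val: Sequence[int],
                 k: int) -> List[int]:
    """Pick up to k classifiers from the shared pool for one target domain.

    Greedy forward selection (a stand-in for the SDP-based pruning named in
    the abstract): at each step, add the classifier whose inclusion gives the
    highest validation AUC on the target domain. Returns indices into pool.
    """
    chosen: List[int] = []
    remaining = list(range(len(pool)))
    while remaining and len(chosen) < k:
        best = max(
            remaining,
            key=lambda i: auc(
                ensemble_scores([pool[j] for j in chosen + [i]], X_val), y_val
            ),
        )
        chosen.append(best)
        remaining.remove(best)
    return chosen
```

In this sketch, `pool` would hold classifiers trained on every product category, while `X_val` and `y_val` come from the single category being predicted, so a classifier from a conflicting category is simply never selected if it hurts validation AUC.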
Keywords :
pattern classification; classification ensemble; ensemble pruning; prediction problem; semidefinite programming; Bagging; Boosting; Cities and towns; Classification tree analysis; Credit cards; Data mining; Decision trees; Neural networks; Optimization methods; Voting
Conference_Title :
Data Mining, Fifth IEEE International Conference on
Print_ISBN :
0-7695-2278-5
DOI :
10.1109/ICDM.2005.131