Title :
A Sequential Ensemble Classification (SEC) System for Tackling the Problem of Unbalance Learning: A Case Study
Author :
Sheikh-Nia, S. ; Grewal, Gary ; Areibi, Shawki
Author_Institution :
Sch. of Comput. Sci., Univ. of Guelph, Guelph, ON, Canada
Abstract :
In this paper, we propose a Sequential Ensemble Classification (SEC) technique designed to tackle the problem of learning from a data set with an extremely unbalanced distribution of instances among the classes. The system employs a decomposition technique that reduces the degree of unbalance in the data by transforming a multi-class problem into a sequence of binary-class problems. We investigate two implementations of the proposed method: one based on an ensemble of homogeneous classifiers and one based on an ensemble of heterogeneous classifiers. A real-world medical data set serves as the case study. The data set is voluminous and highly unbalanced, and it spans a wide range of class values, some of which contain only a few instances. Our experimental results show that both implementations of the SEC system outperform standalone classifiers, with the highest performance achieved by the homogeneous design.
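The decomposition idea in the abstract, turning one multi-class problem into a sequence of binary problems, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state how classes are ordered or which base learners are used, so the ordering (largest class first), the toy nearest-centroid base learner, and the names `CentroidBinary` and `SequentialEnsemble` are all assumptions made purely for illustration.

```python
from collections import Counter


class CentroidBinary:
    """Toy binary learner (assumed stand-in for any base classifier)."""

    def fit(self, X, y):
        # One centroid per binary label (True / False).
        self.centroids = {}
        for label in set(y):
            rows = [x for x, t in zip(X, y) if t == label]
            self.centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
        return self

    def predict_one(self, x):
        def sq_dist(c):
            return sum((a - b) ** 2 for a, b in zip(x, c))
        # Return the binary label whose centroid is closest to x.
        return min(self.centroids, key=lambda lbl: sq_dist(self.centroids[lbl]))


class SequentialEnsemble:
    """Hypothetical sequential decomposition: one binary stage per class."""

    def __init__(self, make_base=CentroidBinary):
        self.make_base = make_base  # factory; the same learner type at every stage
        self.stages = []            # list of (class_label, fitted binary learner)
        self.fallback = None        # class left over after the last stage

    def fit(self, X, y):
        X, y = list(X), list(y)
        # Assumed ordering: handle the most frequent class first.
        order = [label for label, _ in Counter(y).most_common()]
        self.stages = []
        for label in order[:-1]:
            # Binary problem: "is this instance in `label`?" vs. all remaining classes.
            targets = [t == label for t in y]
            self.stages.append((label, self.make_base().fit(X, targets)))
            # Remove the handled class so later stages see only the remaining classes.
            keep = [i for i, t in enumerate(y) if t != label]
            X = [X[i] for i in keep]
            y = [y[i] for i in keep]
        self.fallback = order[-1]
        return self

    def predict_one(self, x):
        # Walk the stages in order; the first stage that accepts x decides the class.
        for label, clf in self.stages:
            if clf.predict_one(x):
                return label
        return self.fallback
```

In this sketch every stage uses the same learner type, which loosely corresponds to a homogeneous ensemble; passing a factory that returns a different learner per stage would be one way to approximate a heterogeneous one.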
Keywords :
learning (artificial intelligence); medical computing; pattern classification; SEC system; binary class problems; class values; decomposition technique; extremely unbalanced distribution; heterogeneous classifier ensemble; homogeneous classifier ensemble; medical data set; multiclass problem; sequential ensemble classification system; unbalanced learning; Artificial neural networks; Boosting; Hospitals; Niobium; Testing; Training; ensemble-based classification; multi-class unbalanced problem; unbalanced distribution
Conference_Title :
Machine Learning and Applications (ICMLA), 2012 11th International Conference on
Conference_Location :
Boca Raton, FL
Print_ISBN :
978-1-4673-4651-1
DOI :
10.1109/ICMLA.2012.154