DocumentCode
1333943
Title
Developing New Fitness Functions in Genetic Programming for Classification With Unbalanced Data
Author
Bhowan, Urvesh ; Johnston, Mark ; Zhang, Mengjie
Author_Institution
Sch. of Eng. & Comput. Sci., Victoria Univ. of Wellington, Wellington, New Zealand
Volume
42
Issue
2
fYear
2012
fDate
4/1/2012 12:00:00 AM
Firstpage
406
Lastpage
421
Abstract
Machine learning algorithms such as genetic programming (GP) can evolve biased classifiers when data sets are unbalanced. Data sets are unbalanced when at least one class is represented by only a small number of training examples (called the minority class) while other classes make up the majority. In this scenario, classifiers can have good accuracy on the majority class but very poor accuracy on the minority class(es) due to the influence that the larger majority class has on traditional training criteria in the fitness function. This paper aims to both highlight the limitations of the current GP approaches in this area and develop several new fitness functions for binary classification with unbalanced data. Using a range of real-world classification problems with class imbalance, we empirically show that these new fitness functions evolve classifiers with good performance on both the minority and majority classes. Our approaches use the original unbalanced training data in the GP learning process, without the need to artificially balance the training examples from the two classes (e.g., via sampling).
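The bias described in the abstract can be illustrated with a small sketch. The code below is not taken from the paper: the function names, the confusion-matrix counts, and the choice of averaging the two per-class accuracies are all assumptions made for illustration. It compares plain overall accuracy with a measure that weights the minority and majority classes equally, for a hypothetical classifier that labels every example as the majority class.

```python
# Illustrative sketch only; names and numbers are hypothetical, not from the paper.

def overall_accuracy(tp, tn, fp, fn):
    """Fraction of all examples classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


def average_class_accuracy(tp, tn, fp, fn):
    """Mean of minority-class and majority-class accuracy.

    Here the minority class is treated as the positive class, so
    minority accuracy is tp / (tp + fn) and majority accuracy is
    tn / (tn + fp). Each class contributes equally regardless of size.
    """
    minority_acc = tp / (tp + fn)
    majority_acc = tn / (tn + fp)
    return 0.5 * (minority_acc + majority_acc)


# Hypothetical unbalanced data set: 950 majority and 50 minority examples.
# The classifier predicts "majority" for everything.
tp, fn = 0, 50    # every minority example is missed
tn, fp = 950, 0   # every majority example is correct

print(overall_accuracy(tp, tn, fp, fn))        # 0.95
print(average_class_accuracy(tp, tn, fp, fn))  # 0.5
```

Under these assumed counts, overall accuracy rewards the degenerate classifier with 0.95, while the equally weighted measure exposes that it performs no better than chance on the two classes combined. A fitness function built on the second kind of measure can use the original unbalanced training data without resampling.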
Keywords
data handling; genetic algorithms; learning (artificial intelligence); pattern classification; GP learning process; biased classifiers; binary classification; class imbalance; data sets; fitness functions; genetic programming; machine learning algorithms; majority class; minority class; training criteria; unbalanced data; unbalanced training data; Accuracy; Feature extraction; Genetics; Loss measurement; Machine learning; Machine learning algorithms; Training; Classification; fitness function; genetic programming (GP); unbalanced data;
fLanguage
English
Journal_Title
Systems, Man, and Cybernetics, Part B: Cybernetics, IEEE Transactions on
Publisher
IEEE
ISSN
1083-4419
Type
jour
DOI
10.1109/TSMCB.2011.2167144
Filename
6029340