Title :
A Novel Differential Evolution-Clustering Hybrid Resampling Algorithm on Imbalanced Datasets
Author :
Chen, Leichen ; Cai, Zhihua ; Chen, Lu ; Gu, Qiong
Author_Institution :
Sch. of Comput., China Univ. of Geosci., Wuhan, China
Abstract :
When dealing with imbalanced datasets (IDS), the separating hyperplane of a support vector machine (SVM) tends to shift toward the minority (positive) class, which causes low classification accuracy. To address this problem, we propose a novel differential evolution-clustering hybrid resampling SVM algorithm (DEC-SVM). The algorithm uses mutation and crossover operators similar to those of Differential Evolution (DE) to over-sample the positive class and raise its proportion; clustering is then applied to the over-sampled training dataset as a data-cleaning step for both classes, removing redundant or noisy samples. Experimental results on ten UCI standard datasets show that DEC-SVM outperforms standard SVM, SMOTE-SVM and DE-SVM in terms of F-measure and ROC area (AUC).
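The over-sampling step described in the abstract can be illustrated with a minimal sketch. It assumes the classic DE/rand/1 mutation with binomial crossover; the abstract only says the operators are "similar" to those of DE, so the exact operators, the function name `de_oversample`, and the defaults `f=0.5` and `cr=0.9` are illustrative assumptions rather than details from the paper. The clustering-based cleaning step is not shown.

```python
import random


def de_oversample(minority, n_new, f=0.5, cr=0.9, seed=0):
    """Generate n_new synthetic minority samples with DE-style operators.

    f (scale factor) and cr (crossover rate) are illustrative defaults,
    not values taken from the paper.
    """
    if len(minority) < 4:
        raise ValueError("need at least 4 minority samples")
    rng = random.Random(seed)
    dim = len(minority[0])
    synthetic = []
    for _ in range(n_new):
        # Pick a target and three other distinct minority samples.
        target, a, b, c = rng.sample(minority, 4)
        # Mutation: perturb a by the scaled difference of b and c.
        mutant = [a[j] + f * (b[j] - c[j]) for j in range(dim)]
        # Binomial crossover: mix mutant and target, forcing at least
        # one coordinate to come from the mutant.
        forced = rng.randrange(dim)
        child = [
            mutant[j] if (j == forced or rng.random() < cr) else target[j]
            for j in range(dim)
        ]
        synthetic.append(child)
    return synthetic


# Toy minority class with two features (made-up values for illustration).
minority = [[1.0, 2.0], [1.2, 1.9], [0.9, 2.2], [1.1, 2.1], [1.3, 2.0]]
new_points = de_oversample(minority, n_new=10)
```

In a full pipeline along the lines the abstract describes, the synthetic points would be appended to the training set, the combined set cleaned by clustering, and an SVM trained on the result.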
Keywords :
pattern clustering; sampling methods; support vector machines; F-measure criterion; ROC area criterion; clustering algorithm; crossover operators; data cleaning method; differential evolution; hybrid resampling algorithm; imbalanced datasets; minority class; mutation operators; support vector machine; Cleaning; Clustering algorithms; Data mining; Electronic mail; Geology; Intrusion detection; Learning systems; Signal to noise ratio; Support vector machine classification; clustering; hybrid resampling
Conference_Title :
Knowledge Discovery and Data Mining, 2010. WKDD '10. Third International Conference on
Conference_Location :
Phuket
Print_ISBN :
978-1-4244-5397-9
Electronic_ISBN :
978-1-4244-5398-6
DOI :
10.1109/WKDD.2010.48