Title of article :
AGNES-SMOTE: An Oversampling Algorithm Based on Hierarchical Clustering and Improved SMOTE
Author/Authors :
Wang, Xin School of Computer Information Security - Guilin University of Electronic Technology, Guilin, China , Yang, Yue School of Computer Information Security - Guilin University of Electronic Technology, Guilin, China , Chen, Mingsong Beihai Campus - Guilin University of Electronic Technology, Beihai, China , Wang, Qin Beihai Campus - Guilin University of Electronic Technology, Beihai, China , Qin, Qin Beihai Campus - Guilin University of Electronic Technology, Beihai, China , Jiang, Hua School of Computer Information Security - Guilin University of Electronic Technology, Guilin, China , Wang, Huijiao School of Computer Information Security - Guilin University of Electronic Technology, Guilin, China
Pages :
9
From page :
1
To page :
9
Abstract :
To address the low classification accuracy on imbalanced datasets, this paper proposes AGNES-SMOTE (Agglomerative Nesting-Synthetic Minority Oversampling Technique), an oversampling algorithm based on hierarchical clustering and an improved SMOTE. Its key procedures are: hierarchically clustering the majority samples and the minority samples separately; dividing the minority subclusters on the basis of the obtained majority subclusters; selecting “seed samples” based on the sampling weight and probability distribution of each minority subcluster; and restricting the generation of new samples to a certain area by the centroid method during sampling. AGNES-SMOTE is combined with SVM (Support Vector Machine) to classify imbalanced datasets. Experiments on UCI datasets compare the performance of different algorithms from the literature. The results indicate that AGNES-SMOTE excels in synthesizing new samples and improves SVM classification performance on imbalanced datasets.
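The procedure summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no formulas, so the stopping rule (`threshold`), the inverse-size sampling weight, and the function names `agnes` and `oversample` are all assumptions made for illustration. The step that divides minority subclusters using the majority subclusters is omitted. Only the overall shape is shown: cluster the minority class bottom-up, pick a subcluster by weight, pick a seed sample, and interpolate toward the subcluster centroid so that new samples stay inside the subcluster's region.

```python
import math
import random


def centroid(points):
    """Component-wise mean of a list of equal-length points."""
    n = len(points)
    return [sum(p[d] for p in points) / n for d in range(len(points[0]))]


def agnes(points, threshold):
    """Naive agglomerative (bottom-up) clustering using centroid distance.

    Repeatedly merges the two closest clusters and stops once the closest
    pair is farther apart than `threshold` (an assumed stopping rule).
    """
    clusters = [[p] for p in points]
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = math.dist(centroid(clusters[i]), centroid(clusters[j]))
                if best is None or d < best[0]:
                    best = (d, i, j)
        if best[0] > threshold:
            break
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


def oversample(minority, n_new, threshold, rng=None):
    """Generate `n_new` synthetic minority samples.

    Assumed weighting: a subcluster's sampling weight is the inverse of
    its size, so sparse subclusters receive relatively more new samples.
    """
    if rng is None:
        rng = random.Random(0)
    clusters = agnes(minority, threshold)
    weights = [1.0 / len(c) for c in clusters]
    synthetic = []
    for _ in range(n_new):
        cluster = rng.choices(clusters, weights=weights, k=1)[0]
        seed = rng.choice(cluster)      # the "seed sample"
        center = centroid(cluster)
        gap = rng.random()              # interpolation factor in [0, 1)
        # Interpolating toward the centroid keeps the new point inside
        # the subcluster's convex hull.
        synthetic.append([s + gap * (c - s) for s, c in zip(seed, center)])
    return synthetic


minority = [[0.0, 0.0], [0.2, 0.1], [0.1, 0.3], [5.0, 5.0], [5.2, 5.1]]
new_samples = oversample(minority, n_new=10, threshold=1.0)
```

In this sketch a single-sample subcluster can only reproduce its own seed, since seed and centroid coincide. The augmented minority set would then be combined with the majority samples before training a classifier such as an SVM.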
Keywords :
AGNES-SMOTE , SMOTE , Oversampling Algorithm , Hierarchical Clustering and Improved SMOTE
Journal title :
Scientific Programming
Serial Year :
2020
Full Text URL :
Record number :
2610331
Link To Document :