DocumentCode
1906375
Title
Safe level graph for synthetic minority over-sampling techniques
Author
Bunkhumpornpat, Chumphol ; Subpaiboonkit, Sitthichoke
Author_Institution
Dept. of Comput. Sci., Chiang Mai Univ., Chiang Mai, Thailand
fYear
2013
fDate
4-6 Sept. 2013
Firstpage
570
Lastpage
575
Abstract
In the class imbalance problem, most existing classifiers, which are designed for balanced class distributions, fail to recognize minority classes because a large number of negative instances can dominate a few positive instances. Borderline-SMOTE and Safe-Level-SMOTE are over-sampling techniques that handle this situation by generating synthetic instances in different regions. The former operates on the border of a minority class, while the latter works inside the class, far from the border. Unfortunately, a data miner has no convenient way to determine which SMOTE is suitable for a given dataset. This paper proposes a safe level graph as a guideline tool for selecting an appropriate SMOTE; the graph also describes the characteristics of the minority class in an imbalanced dataset. When the advice of the safe level graph is followed, the experimental success rate is shown to reach 73% when the F-measure is used as the performance measure and 78% for satisfactory AUCs.
Keywords
data mining; graph theory; pattern classification; sampling methods; F-measure; balance dataset distribution; borderline-SMOTE; class imbalance problem; data miner; guideline tool; minority class border; safe level graph; safe-level-SMOTE; synthetic minority over-sampling techniques; Decision trees; Diabetes; Earth; Noise; Remote sensing; Satellites; Vectors; Borderline-SMOTE; Safe-Level-SMOTE; class imbalance problem; over-sampling; safe level graph;
fLanguage
English
Publisher
IEEE
Conference_Titel
Communications and Information Technologies (ISCIT), 2013 13th International Symposium on
Conference_Location
Surat Thani
Print_ISBN
978-1-4673-5578-0
Type
conf
DOI
10.1109/ISCIT.2013.6645923
Filename
6645923