DocumentCode :
2291831
Title :
SkewBoost: An algorithm for classifying imbalanced datasets
Author :
Hukerikar, Saumil ; Tumma, Ashwin ; Nikam, Akshay ; Attar, Vahida
Author_Institution :
Dept. of Comput. Eng. & Inf. Technol., Coll. of Eng. Pune, Pune, India
fYear :
2011
fDate :
15-17 Sept. 2011
Firstpage :
46
Lastpage :
52
Abstract :
Many real-world data sets have an imbalanced distribution of instances. Learning from such data sets biases the classifier towards the majority class, so it tends to misclassify minority class samples. In this paper, we provide a technique, SkewBoost, which classifies the minority instances correctly without compromising much on the correct classification of the majority instances. In SkewBoost, minority and majority instances are identified during execution of the boosting algorithm. A variation of SMOTE is used to create synthetic minority instances, which are then added to the training set, and the total weight is rebalanced. After each iteration of the boosting algorithm, the weight of each instance is modified to focus more on the misclassified instances; a cost-sensitive approach is used for this reweighting. The method is evaluated in terms of F-measure, G-mean, AUC, recall, and precision on imbalanced data sets, and compared against previously published results of algorithms for imbalanced data sets.
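The abstract names several evaluation measures without defining them. As a rough illustration, the sketch below computes precision, recall, F-measure, and G-mean from hard predictions using their standard textbook definitions, with the minority class treated as the positive class. The function name `imbalance_metrics`, the `minority` parameter, and the toy labels are assumptions made for this example and do not come from the paper; AUC is left out because it needs classifier scores rather than hard labels.

```python
import math


def imbalance_metrics(y_true, y_pred, minority=1):
    """Standard precision, recall, F-measure and G-mean.

    `minority` is the label treated as the positive class
    (a hypothetical parameter introduced for this sketch).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == minority and p == minority)
    fp = sum(1 for t, p in pairs if t != minority and p == minority)
    fn = sum(1 for t, p in pairs if t == minority and p != minority)
    tn = sum(1 for t, p in pairs if t != minority and p != minority)

    # Guard every ratio against an empty denominator.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0

    # F-measure: harmonic mean of precision and recall.
    if precision + recall:
        f_measure = 2 * precision * recall / (precision + recall)
    else:
        f_measure = 0.0

    # G-mean: geometric mean of minority recall and majority recall.
    g_mean = math.sqrt(recall * specificity)

    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
        "g_mean": g_mean,
    }


# Toy data (made up for illustration): 2 minority and 8 majority instances.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]
metrics = imbalance_metrics(y_true, y_pred)
```

On this toy data the accuracy is 0.8 even though only half of the minority instances are found, which is why minority-focused measures such as those listed in the abstract are preferred over plain accuracy on skewed data.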
Keywords :
data handling; iterative methods; learning (artificial intelligence); pattern classification; AUC; F-measure; G-mean; SMOTE variation; SkewBoost technique; boosting algorithm; cost-sensitive approach; imbalanced dataset classification algorithm; precision; recall; synthetic minority instances; Accuracy; Boosting; Classification algorithms; Data mining; Measurement; boosting; instance weights; minority class; over sampling;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Computer and Communication Technology (ICCCT), 2011 2nd International Conference on
Conference_Location :
Allahabad
Print_ISBN :
978-1-4577-1385-9
Type :
conf
DOI :
10.1109/ICCCT.2011.6075185
Filename :
6075185