DocumentCode
1800016
Title
Error signal distribution as an indicator of imbalanced data
Author
Furundzic, Drasko ; Stankovic, Stevan ; Dimic, Goran
Author_Institution
Mihajlo Pupin Inst., Belgrade, Serbia
fYear
2014
fDate
25-27 Nov. 2014
Firstpage
189
Lastpage
194
Abstract
This paper defines criteria for assessing the imbalance of datasets used to train predictive learning models. The most important criterion is the distribution of the error signal over the space of a local measure of distance between the points of the training set. The paper analyzes this indicator for sets with various distributions and shows that, for the identical mapping of data sets from the real domain, the greatest information potential is contained in data whose internal distribution is uniform.
Keywords
data handling; learning (artificial intelligence); statistical distributions; data sets; error signal distribution; imbalanced data; internal distribution; local measure; predictive learning models; training set; Approximation methods; Data mining; Data models; Electronic mail; Entropy; Predictive models; Training; Imbalanced data; imbalanced learning; predictive models;
fLanguage
English
Publisher
ieee
Conference_Titel
Neural Network Applications in Electrical Engineering (NEUREL), 2014 12th Symposium on
Conference_Location
Belgrade
Print_ISBN
978-1-4799-5887-0
Type
conf
DOI
10.1109/NEUREL.2014.7011503
Filename
7011503