Title :
Mining incomplete data with lost values and attribute-concept values
Author :
Clark, Patrick G. ; Grzymala-Busse, Jerzy W.
Author_Institution :
Dept. of Electr. Eng. & Comput. Sci., Univ. of Kansas, Lawrence, KS, USA
Abstract :
This paper presents a novel experimental comparison of two interpretations of missing attribute values: lost values and attribute-concept values. Experiments were conducted on 176 data sets, which were preprocessed using three kinds of probabilistic approximations (lower, middle and upper) and then processed by the MLEM2 rule induction system. Performance was evaluated using the error rate computed by ten-fold cross validation. Our main objective was to determine which of the two interpretations of missing attribute values is better in terms of the error rate. In our experiments, lost values gave the better performance in 10 out of 24 cases. In the remaining 14 cases, the difference in performance was not statistically significant (5% significance level).
Keywords :
approximation theory; data mining; rough set theory; MLEM2 rule induction system; attribute-concept values; lost values; probabilistic approximations; Approximation methods; Education; Error analysis; Image segmentation; Iris recognition; Probabilistic logic; Rough sets
Conference_Title :
Granular Computing (GrC), 2014 IEEE International Conference on
Conference_Location :
Noboribetsu
DOI :
10.1109/GRC.2014.6982806