DocumentCode :
2386709
Title :
Addressing Missing Attributes during Data Mining Using Frequent Itemsets and Rough Set Based Predictions
Author :
Li, Jiye ; Cercone, Nick ; Cohen, Robin
Author_Institution :
York Univ., Toronto
fYear :
2007
fDate :
2-4 Nov. 2007
Firstpage :
294
Lastpage :
294
Abstract :
In this paper, we present an improved method for predicting missing attribute values in data sets. We use frequent itemsets, generated by the association rules algorithm, which capture the correlations between different items in a set of transactions. In particular, we treat a database as a set of transactions and each data instance as an itemset, so that the frequent itemsets can serve as a knowledge base for predicting missing attribute values. Our approach integrates the RSFit method, which is based on rough set theory and produces faster predictions by considering similarities of attribute-value pairs, but only for those attributes contained in the core or a reduct of the data set. Using empirical studies on UCI and other real-world data sets, we demonstrate a significant increase in prediction accuracy obtained from our new integrated approach, referred to as ItemRSFit.
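The idea of using frequent itemsets as a knowledge base can be illustrated with a minimal Python sketch. It is not the authors' ItemRSFit implementation: it omits the RSFit and rough set step entirely, enumerates itemsets by brute force instead of running an association rules algorithm, and the function names, the `min_support` and `max_size` parameters, the toy data, and the rule "prefer the itemset that matches the most known attributes, then the highest support" are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(rows, min_support=2, max_size=3):
    """Count itemsets of (attribute, value) pairs over complete rows.

    Each row is treated as one transaction. Brute-force enumeration is
    used here for brevity; it is an assumption, not the paper's algorithm.
    """
    counts = Counter()
    for row in rows:
        items = list(row.items())
        for size in range(1, max_size + 1):
            for combo in combinations(items, size):
                counts[frozenset(combo)] += 1
    # Keep only itemsets that occur at least min_support times.
    return {s: c for s, c in counts.items() if c >= min_support}


def predict_missing(row, target, itemsets):
    """Predict row[target] from the best matching frequent itemset, or None."""
    known = {(a, v) for a, v in row.items() if a != target and v is not None}
    best = None  # (number of matched items, support, predicted value)
    for itemset, support in itemsets.items():
        target_items = [(a, v) for a, v in itemset if a == target]
        rest = itemset - set(target_items)
        # Usable only if it names a target value and agrees with known values.
        if len(target_items) != 1 or not rest or not rest <= known:
            continue
        candidate = (len(rest), support, target_items[0][1])
        if best is None or candidate[:2] > best[:2]:
            best = candidate
    return best[2] if best else None


# Hypothetical toy data: complete instances act as the transaction database.
complete_rows = [
    {"outlook": "sunny", "wind": "weak", "play": "no"},
    {"outlook": "sunny", "wind": "strong", "play": "no"},
    {"outlook": "rain", "wind": "weak", "play": "yes"},
    {"outlook": "rain", "wind": "weak", "play": "yes"},
    {"outlook": "rain", "wind": "strong", "play": "no"},
]
knowledge_base = frequent_itemsets(complete_rows, min_support=2)

incomplete = {"outlook": "rain", "wind": "weak", "play": None}
print(predict_missing(incomplete, "play", knowledge_base))
```

When no frequent itemset agrees with the known attribute values, `predict_missing` returns `None`, which marks the cases where a fallback predictor would be needed; in the paper that role is played by RSFit.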
Keywords :
data mining; rough set theory; ItemRSFit; RSFit method; association rules algorithm; frequent itemsets; knowledge base; missing attribute value prediction; missing attributes; rough set based predictions; rough sets theory; Accuracy; Association rules; Data preprocessing; Data privacy; Design for experiments; Itemsets; Rough sets; Testing; Transaction databases
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Granular Computing, 2007. GRC 2007. IEEE International Conference on
Conference_Location :
Fremont, CA
Print_ISBN :
978-0-7695-3032-1
Type :
conf
DOI :
10.1109/GrC.2007.144
Filename :
4403113