DocumentCode
1797733
Title
Imputation of missing data supported by Complete p-Partite attribute-based Decision Graphs
Author
Bertini, J.R. ; do Carmo Nicoletti, Maria ; Liang Zhao
Author_Institution
Comput. Sci. Dept., Univ. of Sao Paulo, Sao Paulo, Brazil
fYear
2014
fDate
6-11 July 2014
Firstpage
1100
Lastpage
1106
Abstract
Missing attribute values are a recurrent problem in data mining and machine learning. Although there are plenty of techniques to handle this problem, most of them are too simplistic to provide a good estimation for absent attribute values. A very active research area focuses on solving the missing attribute value problem via imputation methods, which replace missing data with substituted values. This paper proposes a new imputation method which uses a special graph named Complete p-Partite Attribute-based Decision Graphs (CpP-AbDG) to estimate, in a consistent and plausible way, the missing values. The graph is built by dividing the range of each attribute that describes the data into sub-intervals; the sub-intervals are treated as the vertices of a graph. Edges are then established between pairs of different vertices, provided they do not relate to the same attribute. The edges and vertices are finally assigned a weight, based on distributions of the classes. The resulting CpP-AbDG has been shown to be a suitable and informative data structure for finding the proper interval in which a missing attribute value should lie, taking into account all the attributes that describe the data. Results comparing the proposed approach to classical ones, in a computational environment that uses classification problems as the evaluation criterion, show the potential of the method.
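The construction described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how ranges are divided, how weights are computed, or how the interval is selected, so the equal-width binning, the per-class co-occurrence counts used as weights, and the names `build_graph` and `impute_interval` are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def build_graph(rows, labels, bins=3):
    """Build a p-partite graph whose vertices are (attribute, sub-interval) pairs.

    rows: list of equal-length numeric tuples; labels: one class label per row.
    """
    p = len(rows[0])
    # Assumption: equal-width sub-intervals over each attribute's observed range.
    lo = [min(r[a] for r in rows) for a in range(p)]
    hi = [max(r[a] for r in rows) for a in range(p)]

    def interval(a, x):
        if hi[a] == lo[a]:
            return 0
        k = int((x - lo[a]) / (hi[a] - lo[a]) * bins)
        return min(k, bins - 1)  # the maximum value falls in the last interval

    vertex_w = Counter()  # (vertex, class) -> count (assumed weight)
    edge_w = Counter()    # (vertex_u, vertex_v, class) -> co-occurrence count
    for r, y in zip(rows, labels):
        vs = [(a, interval(a, r[a])) for a in range(p)]
        for v in vs:
            vertex_w[(v, y)] += 1
        # Edges only join vertices of different attributes (p-partite).
        for u, v in combinations(vs, 2):
            edge_w[(u, v, y)] += 1
    return interval, vertex_w, edge_w


def impute_interval(row, label, missing, graph, bins=3):
    """Pick the sub-interval of attribute `missing` best supported by the
    known attributes of `row` (None marks a missing value)."""
    interval, _, edge_w = graph
    known = [(a, interval(a, x)) for a, x in enumerate(row)
             if a != missing and x is not None]

    def score(k):
        v = (missing, k)
        total = 0
        for u in known:
            # Edge keys are stored with the lower attribute index first.
            key = (u, v, label) if u[0] < v[0] else (v, u, label)
            total += edge_w[key]
        return total

    return max(range(bins), key=score)


rows = [(0.0, 0.0), (0.1, 0.2), (1.0, 1.0), (0.9, 0.8)]
labels = ["a", "a", "b", "b"]
graph = build_graph(rows, labels, bins=2)
print(impute_interval((0.95, None), "b", missing=1, graph=graph, bins=2))
```

In this toy example the second attribute of the incomplete row is assigned to the upper sub-interval, because that interval co-occurs with the known first-attribute interval for class "b". Turning the chosen interval into a concrete value (for instance its midpoint) is a further design choice the abstract leaves open.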
Keywords
data mining; graph theory; learning (artificial intelligence); CpP-AbDG; complete p-partite attribute-based decision graphs; data mining; data structure; machine learning; missing attribute values; Algorithm design and analysis; Data models; Educational institutions; Electronic mail; Machine learning algorithms; Training; Training data;
fLanguage
English
Publisher
IEEE
Conference_Titel
Neural Networks (IJCNN), 2014 International Joint Conference on
Conference_Location
Beijing
Print_ISBN
978-1-4799-6627-1
Type
conf
DOI
10.1109/IJCNN.2014.6889593
Filename
6889593