DocumentCode
2314929
Title
Unified Strategy for Feature Selection and Data Imputation
Author
Bratu, Camelia Vidrighin ; Potolea, Rodica
Author_Institution
Comput. Sci. Dept., Tech. Univ. of Cluj-Napoca, Cluj-Napoca, Romania
fYear
2009
fDate
26-29 Sept. 2009
Firstpage
413
Lastpage
419
Abstract
Data-related issues are the main causes of insufficient performance in data mining. Existing strategies for tackling them include procedures for handling incomplete data, which are mandatory in various schemes, and feature selection; both augment the learning process. Our previous work on data imputation has shown that a good imputation policy for attributes strongly correlated with the class can improve learning accuracy. Feature selection likewise enhances the performance of an inducer. This paper validates the performance and stability of our combined methodology for pre-processing data. The novelty of the method lies in combining feature selection with data imputation to obtain an improved version of the training set. The experimental results show that, when mining incomplete data, the combined pre-processing methodology boosts the accuracy of a classifier. Moreover, it produces results better than or similar to those of each individual step it combines, feature selection and imputation.
Keywords
data handling; data mining; learning (artificial intelligence); pattern classification; data handling; data imputation; data mining; feature selection; learning process; Cleaning; Computer science; Data analysis; Data mining; Data preprocessing; Filtering; Humans; Performance evaluation; Scientific computing; Stability; classification; combined methodology; feature selection; imputation; pre-processing;
fLanguage
English
Publisher
IEEE
Conference_Titel
Symbolic and Numeric Algorithms for Scientific Computing (SYNASC), 2009 11th International Symposium on
Conference_Location
Timisoara
Print_ISBN
978-1-4244-5910-0
Electronic_ISBN
978-1-4244-5911-7
Type
conf
DOI
10.1109/SYNASC.2009.53
Filename
5460822