Title :
Improving classification accuracy through feature selection
Author :
Bratu, Camelia Vidrighin ; Muresan, Tudor ; Potolea, Rodica
Author_Institution :
Tech. Univ. of Cluj-Napoca, Cluj-Napoca
Abstract :
High accuracy is essential to any data mining process. Much of what determines the success of a data mining task lies in the quality of the data used. Feature selection is one of the tools that can refine a dataset before it is presented to a learning scheme. This paper analyzes a wrapper approach to feature selection, with the aim of boosting classification accuracy. A wrapper is viewed as a 3-tuple consisting of a generation procedure, an evaluation function and a validation procedure. Experimental evaluations were performed for several combinations of the three components. The results show that feature selection improves classification accuracy and speeds up the training process. Moreover, two robust combinations are proposed: one that consistently achieves the highest accuracy, and one that significantly boosts the initial accuracy of the inducer.
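The 3-tuple view of a wrapper described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the generation procedure is a greedy forward search, the evaluation function is the 5-fold cross-validated accuracy of a toy 1-nearest-neighbour inducer, the validation procedure is accuracy on a separate hold-out set, and the data are synthetic (one informative feature, two noise features). All function names are hypothetical.

```python
import random

def one_nn_predict(train, features, x):
    """Toy inducer (assumed): 1-nearest-neighbour using only the chosen features."""
    def dist(row):
        return sum((row[0][f] - x[f]) ** 2 for f in features)
    return min(train, key=dist)[1]

def accuracy(train, test, features):
    """Validation procedure (assumed): accuracy of the inducer on unseen rows."""
    hits = sum(one_nn_predict(train, features, x) == y for x, y in test)
    return hits / len(test)

def cv_accuracy(data, features, folds=5):
    """Evaluation function (assumed): k-fold cross-validated accuracy."""
    scores = []
    for k in range(folds):
        test = data[k::folds]
        train = [r for i, r in enumerate(data) if i % folds != k]
        scores.append(accuracy(train, test, features))
    return sum(scores) / folds

def forward_selection(data, n_features, evaluate):
    """Generation procedure (assumed): greedy forward search over feature subsets."""
    selected, best = [], 0.0
    while True:
        candidates = [f for f in range(n_features) if f not in selected]
        scored = [(evaluate(data, selected + [f]), f) for f in candidates]
        if not scored:
            break  # every feature is already selected
        score, feat = max(scored)
        if score <= best:
            break  # no candidate improves the estimate, so stop
        selected, best = selected + [feat], score
    return selected, best

def make_data(n, seed):
    """Synthetic data (assumed): feature 0 tracks the class, features 1-2 are noise."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        y = rng.randint(0, 1)
        x = [y + rng.gauss(0, 0.3), rng.gauss(0, 3.0), rng.gauss(0, 3.0)]
        rows.append((x, y))
    return rows

train = make_data(200, seed=0)
holdout = make_data(100, seed=1)

subset, cv_score = forward_selection(train, 3, cv_accuracy)
val_all = accuracy(train, holdout, [0, 1, 2])  # inducer on all features
val_sel = accuracy(train, holdout, subset)     # inducer on the selected subset

print("selected features:", subset)
print("cross-validated accuracy:", round(cv_score, 3))
print("hold-out accuracy, all features:", round(val_all, 3))
print("hold-out accuracy, selected features:", round(val_sel, 3))
```

Swapping any one of the three functions (a different search, a different inducer inside the evaluation, a different validation split) yields another combination of the kind the abstract says was compared experimentally.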
Keywords :
data mining; classification accuracy; data mining process; feature selection; Bayesian methods; Best practices; Boosting; Data preprocessing; Decision trees; Diversity reception; Performance evaluation; Robustness; Wrapping
Conference_Title :
Intelligent Computer Communication and Processing, 2008. ICCP 2008. 4th International Conference on
Conference_Location :
Cluj-Napoca
Print_ISBN :
978-1-4244-2673-7
DOI :
10.1109/ICCP.2008.4648350