Title :
On the behavior of feature selection methods dealing with noise and relevance over synthetic scenarios
Author :
Bolón-Canedo, V. ; Sánchez-Maroño, N. ; Alonso-Betanzos, A.
Author_Institution :
Dept. of Comput. Sci., Univ. of A Coruna, A Coruna, Spain
Date :
July 31 2011-Aug. 5 2011
Abstract :
Adequate identification of relevant features is fundamental in real-world scenarios. The problem is especially important when the datasets have a much larger number of features than samples. However, in most cases, the relevant features in real datasets are unknown. In this paper, several synthetic datasets are employed to test the effectiveness of different feature selection methods over different artificial classification scenarios, such as altered features (noise), the presence of an increasing number of irrelevant features, and a small ratio between the number of samples and the number of features. Six filters and two embedded methods are tested over five synthetic datasets in order to choose a robust and noise-tolerant method, paving the way for its application to real datasets in the classification domain.
Keywords :
pattern classification; artificial classification scenarios; classification domain; embedded method; feature selection method; noise tolerant method; real datasets; real world scenarios; synthetic datasets; synthetic scenario; Correlation; Feature extraction; Light emitting diodes; Machine learning; Noise; Redundancy; Training;
Conference_Title :
Neural Networks (IJCNN), The 2011 International Joint Conference on
Conference_Location :
San Jose, CA
Print_ISBN :
978-1-4244-9635-8
DOI :
10.1109/IJCNN.2011.6033406