Title :
Non-parametric Regression and Random Balance Method Modification for Determination of the Most Informative Features
Author :
Zablotskaya, Kseniya ; Ahmed, Mumtaz ; Zablotskiy, Sergey ; Minker, Wolfgang
Author_Institution :
Inst. of Inf. Technol., Univ. of Ulm, Ulm, Germany
Abstract :
In this paper we present a new method for detecting the most informative features among all data extracted from a given data corpus. The widely used Pearson's correlation coefficient is not reliable if the dependency between the extracted features (input variables) and the objective function (output) is not linear. Our approach is based on a modified random balance method (RBM) combined with non-parametric kernel regression for modeling the dependency between output and input variables. The standard random balance method stochastically determines the most important features of a process, but it requires the values of the objective function at certain assigned points. If these values cannot be calculated, they must be approximated. Since we assume that the dependency between stochastic variables can be non-linear, an appropriate model must be chosen. We used non-parametric kernel regression because it requires no knowledge of the parametric structure of the dependency. Moreover, we modified the random balance method to handle the non-linearity of the data.
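The abstract does not state which kernel estimator, kernel function, or bandwidth the authors used. As an illustration only, the following is a minimal sketch of the Nadaraya-Watson estimator, one common form of non-parametric kernel regression, which could approximate the objective function at points where it cannot be computed directly. The Gaussian product kernel, the single shared bandwidth, and the names `gaussian_kernel` and `kernel_regression` are assumptions made for this sketch; it does not reproduce the paper's modified random balance method.

```python
import math


def gaussian_kernel(u):
    # Unnormalised Gaussian kernel; the normalising constant cancels
    # in the ratio computed below. The kernel choice is an assumption.
    return math.exp(-0.5 * u * u)


def kernel_regression(x_train, y_train, x_query, bandwidth=0.5):
    """Nadaraya-Watson estimate of the output at x_query.

    x_train: list of feature vectors (lists of floats)
    y_train: list of observed outputs, one per feature vector
    x_query: feature vector at which the output is approximated
    bandwidth: smoothing parameter (hypothetical default, shared by all features)
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for x, y in zip(x_train, y_train):
        # Product kernel: multiply the one-dimensional kernels over all features.
        weight = 1.0
        for xj, qj in zip(x, x_query):
            weight *= gaussian_kernel((qj - xj) / bandwidth)
        weighted_sum += weight * y
        weight_total += weight
    if weight_total == 0.0:
        # Every weight underflowed: the query is too far from the data
        # for this bandwidth.
        raise ValueError("query point has no kernel support in the training data")
    # Locally weighted average of the observed outputs.
    return weighted_sum / weight_total
```

Because the estimate is a locally weighted average of observed outputs, it assumes no parametric form for the dependency, which is the property the abstract relies on. In practice the bandwidth would have to be tuned, for example by cross-validation.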
Keywords :
data mining; feature extraction; random processes; regression analysis; stochastic processes; dependency; most informative features; nonparametric kernel regression; nonparametric regression method; random balance method; stochastic process; stochastic variables; Correlation; Data models; Estimation; Input variables; Kernel
Conference_Title :
Intelligent Environments (IE), 2010 Sixth International Conference on
Conference_Location :
Kuala Lumpur
Print_ISBN :
978-1-4244-7836-1
Electronic_ISBN :
978-0-7695-4149-5