  • DocumentCode
    1815045
  • Title
    Non-parametric Regression and Random Balance Method Modification for Determination of the Most Informative Features
  • Author
    Zablotskaya, Kseniya ; Ahmed, Mumtaz ; Zablotskiy, Sergey ; Minker, Wolfgang
  • Author_Institution
    Inst. of Inf. Technol., Univ. of Ulm, Ulm, Germany
  • fYear
    2010
  • fDate
    19-21 July 2010
  • Firstpage
    64
  • Lastpage
    67
  • Abstract
    In this paper we present a new method for detecting the most informative features among all data extracted from a data corpus. The widely used Pearson's correlation coefficient is not reliable if the dependency between the extracted features (input variables) and the objective function (output) is not linear. Our approach is based on a modified random balance method (RBM) combined with non-parametric kernel regression for modeling the dependency between output and input variables. The standard random balance method stochastically determines the most important features of a process, but it requires the values of the objective function at certain assigned points. If these values cannot be calculated, they must be approximated. Since we assume that the dependency between stochastic variables can be non-linear, an appropriate model is needed. We use non-parametric kernel regression because it requires no knowledge of the parametric structure of the dependency. Moreover, we modify the random balance method to handle the non-linearity of the data.
  • Keywords
    data mining; feature extraction; random processes; regression analysis; stochastic processes; dependency; feature extraction; most informative features; nonparametric kernel regression; nonparametric regression method; random balance method; stochastic process; stochastic variables; Correlation; Data mining; Data models; Estimation; Feature extraction; Input variables; Kernel;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Intelligent Environments (IE), 2010 Sixth International Conference on
  • Conference_Location
    Kuala Lumpur
  • Print_ISBN
    978-1-4244-7836-1
  • Electronic_ISBN
    978-0-7695-4149-5
  • Type
    conf
  • DOI
    10.1109/IE.2010.19
  • Filename
    5673684
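
The abstract's point about Pearson's coefficient can be illustrated with a small sketch. It is not the authors' implementation and does not reproduce their random balance method modification; it only shows, on synthetic data, that Pearson's coefficient can be near zero for a purely non-linear dependency while a non-parametric kernel regression still recovers it. The Nadaraya-Watson estimator, the Gaussian kernel, the bandwidth value, the test function y = x^2, and all function names are assumptions chosen for illustration.

```python
import numpy as np


def pearson(x, y):
    """Pearson's correlation coefficient between two 1-D samples."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc = x - x.mean()
    yc = y - y.mean()
    return float((xc @ yc) / np.sqrt((xc @ xc) * (yc @ yc)))


def nadaraya_watson(x_train, y_train, x_query, bandwidth):
    """Kernel regression estimate at x_query (Gaussian kernel, assumed here).

    Each prediction is a weighted average of y_train, with weights that
    decay with the distance between the query point and each training point.
    """
    x_train = np.asarray(x_train, dtype=float)
    y_train = np.asarray(y_train, dtype=float)
    x_query = np.atleast_1d(np.asarray(x_query, dtype=float))
    # Pairwise scaled distances, shape (n_query, n_train).
    u = (x_query[:, None] - x_train[None, :]) / bandwidth
    w = np.exp(-0.5 * u**2)
    return (w @ y_train) / w.sum(axis=1)


# Synthetic data (assumption): a symmetric input range and a quadratic output,
# so the dependency is strong but has no linear component.
x = np.linspace(-1.0, 1.0, 201)
y = x**2

r = pearson(x, y)
y_hat = nadaraya_watson(x, y, x, bandwidth=0.05)

# Coefficient of determination of the kernel fit on the same points.
r2 = 1.0 - np.sum((y - y_hat) ** 2) / np.sum((y - y.mean()) ** 2)

print(f"Pearson r = {r:.3e}")
print(f"Kernel regression R^2 = {r2:.4f}")
```

In this setup the linear correlation is essentially zero even though y is fully determined by x, whereas the kernel estimate tracks the curve closely. The bandwidth controls the smoothness of the estimate and would need tuning on real data.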