  • DocumentCode
    2705613
  • Title
    DensityRank: A novel feature ranking method based on kernel estimation
  • Author
    Cao, Yuan; He, Haibo; Shen, Xiaoping

  • Author_Institution
    Dept. of Electr. & Comput. Eng., Stevens Inst. of Technol., Hoboken, NJ, USA
  • fYear
    2009
  • fDate
    14-19 June 2009
  • Firstpage
    446
  • Lastpage
    453
  • Abstract
    This paper proposes DensityRank, a novel feature ranking method based on kernel estimation over the feature space, to improve classification performance. As the amount of raw data in many of today's applications continues to grow at an explosive rate, it is critical to assess the learning capability of individual features and to select an important subset of them, both to improve learning accuracy and to reduce computational cost. In our approach, kernel methods are used to estimate the probability density function of each feature under each class label. Discrepancy analysis based on the mean integrated square error (MISE) between pairs of such density estimates provides the ranking values. The ranked subspace method is then adopted to select subsets of important features, which are used to build the learning models. A comparative study of this method against traditional ranking methods based on Fisher's discrimination ratio and information gain theory, as well as against the random subspace algorithm and bootstrap aggregating (bagging), is presented. Simulation results on various real-world data sets illustrate the effectiveness of the proposed method. (An illustrative code sketch of the ranking step follows this record.)
  • Keywords
    data mining; estimation theory; learning (artificial intelligence); mean square error methods; probability; DensityRank; Fisher discrimination ratio; bootstrap aggregation; classification performance; data mining; density estimation; discrepancy analysis; feature ranking; information gain theory; kernel estimation; learning accuracy; learning capability; mean integrated square error; probability density function; random subspace algorithm; ranked subspace; Computational efficiency; Data mining; Decision trees; Filters; Kernel; Machine learning; Mathematics; Neural networks; Performance analysis
  • fLanguage
    English
  • Publisher
    ieee
  • Conference_Titel
    2009 International Joint Conference on Neural Networks (IJCNN 2009)
  • Conference_Location
    Atlanta, GA
  • ISSN
    1098-7576
  • Print_ISBN
    978-1-4244-3548-7
  • Electronic_ISBN
    1098-7576
  • Type
    conf
  • DOI
    10.1109/IJCNN.2009.5178582
  • Filename
    5178582
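
    A minimal Python sketch of the ranking idea described in the abstract, offered as an illustration under stated assumptions rather than the authors' implementation: a Gaussian kernel density estimate is fitted to each feature within each class, the integrated squared difference between every pair of class-conditional densities stands in for the paper's MISE-based discrepancy, and features are ranked by their summed discrepancy. The function name density_rank and its parameters are hypothetical.

        import numpy as np
        from itertools import combinations
        from scipy.stats import gaussian_kde

        def density_rank(X, y, grid_size=256):
            """Rank features by the summed pairwise discrepancy between
            class-conditional kernel density estimates (an illustrative
            stand-in for the MISE-based discrepancy used in the paper)."""
            classes = np.unique(y)
            scores = np.zeros(X.shape[1])
            for j in range(X.shape[1]):
                col = X[:, j]
                # Common evaluation grid spanning this feature's range.
                lo, hi = col.min(), col.max()
                pad = 0.1 * (hi - lo) + 1e-12
                grid = np.linspace(lo - pad, hi + pad, grid_size)
                dx = grid[1] - grid[0]
                # Gaussian KDE of the feature within each class.
                dens = {c: gaussian_kde(col[y == c])(grid) for c in classes}
                # Integrated squared difference, summed over class pairs.
                scores[j] = sum(np.sum((dens[a] - dens[b]) ** 2) * dx
                                for a, b in combinations(classes, 2))
            return np.argsort(scores)[::-1]  # feature indices, best first

    A typical use would be ranking = density_rank(X_train, y_train) followed by keeping the top-k columns, X_train[:, ranking[:k]], before training a classifier; classes with fewer than two samples or zero variance in a feature would break gaussian_kde and need special handling, which this sketch omits.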