• DocumentCode
    498377
  • Title
    The Uniformization and the Feature Selection about the Inconsistent Classification Data Set

  • Author
    Wu, Xin-Ling ; He, Dong-Feng ; Zhou, Guo-Qiang

  • Author_Institution
    Inst. of Comput. Sci., GuangDong Polytech. Normal Univ., Guangzhou, China
  • Volume
    2
  • fYear
    2009
  • fDate
    19-21 May 2009
  • Firstpage
    496
  • Lastpage
    500
  • Abstract
    Inconsistent samples and redundant attributes in a sample data set degrade classification quality and efficiency. This paper proposes a method that makes a classification data set consistent and selects a smallest feature variable set. Based on the Bayesian formula, the method groups inconsistent data into their most likely category to make the data set uniform. From the uniform data set, a category distinction matrix is built, and the smallest feature variable subset that can distinguish the categories accurately is obtained from this matrix. A heuristic search strategy is given to select the feature variables. Experimental results on several UCI standard datasets show that the proposed method can eliminate the inconsistency of the sample dataset, select the optimal feature variables, and effectively reduce the dimension of the data.
  • Keywords
    Bayes methods; data mining; Bayesian formula; UCI standard datasets; category distinction matrix; classification data set; data set uniform; feature selection; heuristic search strategy; optimal feature variables; uniformization; Bayesian methods; Educational technology; Fuzzy sets; Helium; Intelligent systems; Probability; Statistical analysis; Testing; classification; data consistency; data reduction
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Intelligent Systems, 2009. GCIS '09. WRI Global Congress on
  • Conference_Location
    Xiamen
  • Print_ISBN
    978-0-7695-3571-5
  • Type
    conf
  • DOI
    10.1109/GCIS.2009.299
  • Filename
    5209383
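
The pipeline summarized in the abstract (uniformization, category distinction matrix, heuristic feature search) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the authors' implementation: the function names `uniformize`, `distinction_sets`, and `greedy_select` are hypothetical, the posterior P(category | sample) is estimated by simple frequency counts, the distinction matrix is stored as a list of feature-index sets, and the heuristic is a plain greedy cover that may not return a truly minimal subset.

```python
from collections import Counter, defaultdict
from itertools import combinations


def uniformize(rows, labels):
    """Relabel every group of identical feature vectors with its most frequent label.

    Assumption: count(x, c) / N is proportional to P(c | x) by Bayes' formula,
    so the most frequent label in a group is its most likely category.
    Ties go to the label seen first.
    """
    counts = defaultdict(Counter)
    for x, y in zip(rows, labels):
        counts[tuple(x)][y] += 1
    best = {x: c.most_common(1)[0][0] for x, c in counts.items()}
    return [best[tuple(x)] for x in rows]


def distinction_sets(rows, labels):
    """For each pair of distinct samples with different labels, collect the
    indices of the features on which the two samples differ."""
    unique = list(dict.fromkeys((tuple(x), y) for x, y in zip(rows, labels)))
    sets = []
    for (xa, ya), (xb, yb) in combinations(unique, 2):
        if ya != yb:
            sets.append({i for i, (a, b) in enumerate(zip(xa, xb)) if a != b})
    return sets


def greedy_select(sets):
    """Greedy heuristic (an assumed stand-in for the paper's search strategy):
    repeatedly pick the feature that separates the most still-unseparated pairs."""
    remaining = [s for s in sets if s]
    chosen = []
    while remaining:
        freq = Counter(i for s in remaining for i in s)
        best = min(freq, key=lambda i: (-freq[i], i))  # ties: lowest index
        chosen.append(best)
        remaining = [s for s in remaining if best not in s]
    return sorted(chosen)


# Toy data (hypothetical): the first three rows are identical but disagree on the label.
rows = [(0, 0, 1), (0, 0, 1), (0, 0, 1), (1, 0, 1), (1, 1, 0)]
labels = ["a", "a", "b", "b", "b"]

clean_labels = uniformize(rows, labels)
selected = greedy_select(distinction_sets(rows, clean_labels))
print(clean_labels, selected)
```

In this toy data the conflicting third sample is relabeled to match its group, after which a single feature suffices to separate the two categories. A greedy cover is only one possible heuristic; the search strategy actually used in the paper is not specified in this record.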