• DocumentCode
    2660252
  • Title
    The preprocessing in census data with concept hierarchy
  • Author
    Bin, Sheng; Sun, Gengxin
  • Author_Institution
    Coll. of Inf. Sci. & Eng., Qingdao Univ., Qingdao, China
  • Volume
    1
  • fYear
    2010
  • fDate
    16-18 April 2010
  • Abstract
    Data mining can extract implicit, previously unknown, and potentially useful information from data. A census is a major survey of national conditions and national strength. Applying data mining to census data has high research value and broad market potential. As census data are collected and accumulated at dramatically high rates, concept hierarchies, a data mining technique, can be used to reduce the data by collecting low-level concepts and replacing them with higher-level concepts. Although detail is lost through such generalization, the generalized data may be more meaningful and easier to interpret. Mining a reduced data set can improve the quality of both the data being mined and the patterns obtained. In this paper we apply concept hierarchies to preprocess the census data of Chengyang and Laixi, use the dynamic concept hierarchy adjustment algorithm to adjust the concept hierarchy obtained for the attribute “housing construction cost”, and then evaluate the results in preparation for the next step of the mining process.
  • Keywords
    data mining; census data preprocessing; data generalization; data mining techniques; dynamic concept hierarchy adjustment algorithm; housing construction cost; Data engineering; Data mining; Data preprocessing; Databases; Educational institutions; Environmental economics; Information science; Information systems; Mining industry; Power generation economics; Census; Concept Hierarchy; Data Mining; Dynamic Hierarchy Adjustment;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Title
    Computer Engineering and Technology (ICCET), 2010 2nd International Conference on
  • Conference_Location
    Chengdu
  • Print_ISBN
    978-1-4244-6347-3
  • Type
    conf
  • DOI
    10.1109/ICCET.2010.5485990
  • Filename
    5485990