• DocumentCode
    2124903
  • Title
    Identification of Core, Semi-Core and Redundant Attributes of a Dataset
  • Author
    Hashemi, Ray R. ; Bahrami, Azita ; Smith, Mark ; Young, Simon
  • Author_Institution
    Dept. of Comput. Sci., Armstrong Atlantic State Univ., Savannah, GA, USA
  • fYear
    2011
  • fDate
    11-13 April 2011
  • Firstpage
    580
  • Lastpage
    584
  • Abstract
    Data reduction is an essential step in pre-processing a dataset; it is necessary for improving data quality and for extracting the relevant data from the dataset. Data reduction is performed by identifying and removing the redundant attributes of the dataset. However, not every non-redundant attribute contributes equally to the decision (the dependent variable). The non-redundant attributes may therefore be further divided into two sub-categories: core attributes, which contribute totally to the decision, and semi-core attributes, which contribute partially to it. In this paper, a methodology for separating core, semi-core, and redundant attributes is introduced and tested. The results show that the proposed methodology has high potential for use in any generalization process.
  • Keywords
    data reduction; pattern clustering; data quality; dataset redundant attribute; Artificial neural networks; Biochemistry; Heating; Magnetic cores; Noise; Rough sets; Stress; Cluster Quality; Core attribute; Entropy; Information gain; Redundant Attribute; SOM; Semi-core attribute; VSOM clustering
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Information Technology: New Generations (ITNG), 2011 Eighth International Conference on
  • Conference_Location
    Las Vegas, NV
  • Print_ISBN
    978-1-61284-427-5
  • Electronic_ISBN
    978-0-7695-4367-3
  • Type
    conf
  • DOI
    10.1109/ITNG.2011.106
  • Filename
    5945301