• DocumentCode
    2063914
  • Title

    A preliminary study on overlapping and data fracture in imbalanced domains by means of Genetic Programming-based feature extraction

  • Author

    Moreno-Torres, Jose G. ; Herrera, Francisco

  • Author_Institution
    Dept. of Comput. Sci. & Artificial Intell., Univ. de Granada, Granada, Spain
  • fYear
    2010
  • fDate
    Nov. 29 2010-Dec. 1 2010
  • Firstpage
    501
  • Lastpage
    506
  • Abstract
    The classification of imbalanced data is a well-studied topic in data mining. However, there is still a lack of understanding of the factors that make the problem difficult. In this work, we study the two main reasons that make the classification of imbalanced datasets complex: overlapping and data fracture. We present a Genetic Programming-based feature extraction method, driven by Rough Set Theory, that helps visualize the data in a bidimensional graph in order to better understand how the presence of overlapping and data fractures affects classification performance.
  • Keywords
    data mining; feature extraction; genetic algorithms; pattern classification; rough set theory; bidimensional graph; data fracture; genetic programming-based feature extraction; imbalanced data classification; genetic programming; imbalanced data; overlapping
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Intelligent Systems Design and Applications (ISDA), 2010 10th International Conference on
  • Conference_Location
    Cairo
  • Print_ISBN
    978-1-4244-8134-7
  • Type

    conf

  • DOI
    10.1109/ISDA.2010.5687214
  • Filename
    5687214