• DocumentCode
    773594
  • Title
    GP ensembles for large-scale data classification
  • Author
    Folino, Gianluigi; Pizzuti, Clara; Spezzano, Giandomenico
  • Author_Institution
    ICAR-CNR, Rende
  • Volume
    10
  • Issue
    5
  • fYear
    2006
  • Firstpage
    604
  • Lastpage
    616
  • Abstract
    An extension of cellular genetic programming for data classification (CGPC) to induce an ensemble of predictors is presented. Two algorithms implementing the bagging and boosting techniques are described and compared with CGPC. The approach is able to deal with large data sets that do not fit in main memory, since each classifier is trained on a subset of the overall training data. The predictors are then combined to classify new tuples. Experiments on several data sets show that, by using a training set of reduced size, better classification accuracy can be obtained, and at a much lower computational cost.
  • Keywords
    data mining; genetic algorithms; very large databases; bagging technique; boosting techniques; cellular genetic programming; large-scale data classification; Bagging; Boosting; Classification tree analysis; Computational efficiency; Decision trees; Genetic programming; Large-scale systems; Training data; Voting; classification; genetic programming (GP)
  • fLanguage
    English
  • Journal_Title
    Evolutionary Computation, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1089-778X
  • Type
    jour
  • DOI
    10.1109/TEVC.2005.863627
  • Filename
    1705406
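
The abstract's general idea (train each classifier on a subset of the training data, then combine the predictors to classify new tuples) can be illustrated with a minimal bagging sketch. This is not the paper's CGPC algorithm: the base learner here is a simple one-feature threshold rule standing in for a GP-evolved classifier, and all names (`train_stump`, `bagging_ensemble`, `predict`, `subset_size`) are hypothetical choices made for illustration only.

```python
import random
from collections import Counter


def train_stump(rows):
    """Stand-in base learner (NOT genetic programming): one-feature threshold rule.

    rows is a list of (features, label) pairs, where features is a tuple of numbers.
    """
    default = Counter(y for _, y in rows).most_common(1)[0][0]
    best = None  # (errors, feature index, threshold, left label, right label)
    for f in range(len(rows[0][0])):
        for x, _ in rows:
            t = x[f]
            left = [y for xs, y in rows if xs[f] <= t]
            right = [y for xs, y in rows if xs[f] > t]
            left_label = Counter(left).most_common(1)[0][0]
            right_label = Counter(right).most_common(1)[0][0] if right else default
            errors = sum(y != left_label for y in left)
            errors += sum(y != right_label for y in right)
            if best is None or errors < best[0]:
                best = (errors, f, t, left_label, right_label)
    _, f, t, left_label, right_label = best
    return lambda x: left_label if x[f] <= t else right_label


def bagging_ensemble(rows, n_classifiers, subset_size, seed=0):
    """Train each classifier on a bootstrap sample much smaller than the full data."""
    rng = random.Random(seed)
    ensemble = []
    for _ in range(n_classifiers):
        sample = [rng.choice(rows) for _ in range(subset_size)]  # with replacement
        ensemble.append(train_stump(sample))
    return ensemble


def predict(ensemble, x):
    """Combine the predictors by simple majority vote."""
    votes = Counter(clf(x) for clf in ensemble)
    return votes.most_common(1)[0][0]


# Toy data (invented for this sketch): label is 1 when the single feature is >= 50.
data = [((i,), 0 if i < 50 else 1) for i in range(100)]
ensemble = bagging_ensemble(data, n_classifiers=11, subset_size=20)
print(predict(ensemble, (10,)), predict(ensemble, (90,)))
```

Each classifier in the sketch sees only 20 of the 100 tuples, which mirrors the abstract's point that no single learner needs the whole data set in memory. A boosting variant would instead reweight or resample tuples according to earlier classifiers' errors; that is not shown here.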