DocumentCode
773594
Title
GP ensembles for large-scale data classification
Author
Folino, Gianluigi ; Pizzuti, Clara ; Spezzano, Giandomenico
Author_Institution
ICAR-CNR, Rende
Volume
10
Issue
5
fYear
2006
Firstpage
604
Lastpage
616
Abstract
An extension of cellular genetic programming for data classification (CGPC) to induce an ensemble of predictors is presented. Two algorithms implementing the bagging and boosting techniques are described and compared with CGPC. The approach is able to deal with large data sets that do not fit in main memory, since each classifier is trained on a subset of the overall training data. The predictors are then combined to classify new tuples. Experiments on several data sets show that, by using a training set of reduced size, better classification accuracy can be obtained at a much lower computational cost.
Keywords
data mining; genetic algorithms; very large databases; bagging technique; boosting techniques; cellular genetic programming; large-scale data classification; Bagging; Boosting; Classification tree analysis; Computational efficiency; Decision trees; Genetic programming (GP); Large-scale systems; Training data; Voting; classification
fLanguage
English
Journal_Title
Evolutionary Computation, IEEE Transactions on
Publisher
IEEE
ISSN
1089-778X
Type
jour
DOI
10.1109/TEVC.2005.863627
Filename
1705406