DocumentCode
2353504
Title
Bagging is a small-data-set phenomenon
Author
Chawla, N. ; Moore, T.E., Jr. ; Bowyer, K.W. ; Hall, L.O. ; Springer, C. ; Kegelmeyer, P.
Author_Institution
Dept. of Comput. Sci. & Eng., Univ. of South Florida, Tampa, FL, USA
Volume
2
fYear
2001
fDate
8-14 Dec. 2001
Abstract
Bagging forms a committee of classifiers by bootstrap aggregation of training sets from a pool of training data. A simple alternative to bagging is to partition the data into disjoint subsets. Experiments on various datasets show that, given partitions and bags of the same size, disjoint partitions result in better performance than bootstrap aggregates (bags). Many applications (e.g., protein structure prediction) involve datasets that are too large to fit in the memory of a typical computer. Our results indicate that, in such applications, the simple approach of creating a committee of classifiers from disjoint partitions is preferable to the more complex approach of bagging.
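The two sampling schemes contrasted in the abstract can be sketched as follows. This is a minimal illustration in Python, not the authors' implementation: the function names (`bootstrap_bags`, `disjoint_partitions`, `majority_vote`) are hypothetical, and combining the committee by majority vote is an assumption made for the example, since the abstract does not state how votes are combined.

```python
import random
from collections import Counter


def bootstrap_bags(n_items, n_bags, bag_size, seed=0):
    """Return n_bags index lists, each drawn WITH replacement from the pool.

    An index may appear several times in one bag and in several bags.
    """
    rng = random.Random(seed)
    return [
        [rng.randrange(n_items) for _ in range(bag_size)]
        for _ in range(n_bags)
    ]


def disjoint_partitions(n_items, n_parts, seed=0):
    """Shuffle the pool once and split it into n_parts non-overlapping subsets.

    Every index lands in exactly one partition.
    """
    rng = random.Random(seed)
    indices = list(range(n_items))
    rng.shuffle(indices)
    return [indices[i::n_parts] for i in range(n_parts)]


def majority_vote(predictions):
    """Combine committee predictions by plurality (assumed combination rule)."""
    return Counter(predictions).most_common(1)[0][0]


if __name__ == "__main__":
    n = 12
    parts = disjoint_partitions(n, n_parts=3, seed=1)
    bags = bootstrap_bags(n, n_bags=3, bag_size=n // 3, seed=1)
    print("disjoint partitions:", parts)
    print("bootstrap bags:     ", bags)
```

Each index list would be used to train one committee member. Note that in this sketch the bags are given the same size as the partitions, matching the equal-size comparison described in the abstract.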
Keywords
data mining; learning (artificial intelligence); pattern classification; bagging; bootstrap aggregation; classifier committee; disjoint partitions; protein structure prediction; small dataset; training data pool; training sets; Aggregates; Application software; Bagging; Computer science; Data mining; Laboratories; Proteins; Sampling methods; Testing; Training data
fLanguage
English
Publisher
IEEE
Conference_Title
Computer Vision and Pattern Recognition, 2001. CVPR 2001. Proceedings of the 2001 IEEE Computer Society Conference on
Conference_Location
Kauai, HI, USA
ISSN
1063-6919
Print_ISBN
0-7695-1272-0
Type
conf
DOI
10.1109/CVPR.2001.991030
Filename
991030