• DocumentCode
    2353504
  • Title
    Bagging is a small-data-set phenomenon
  • Author
    Chawla, N. ; Moore, T.E., Jr. ; Bowyer, K.W. ; Hall, L.O. ; Springer, C. ; Kegelmeyer, P.
  • Author_Institution
    Dept. of Comput. Sci. & Eng., Univ. of South Florida, Tampa, FL, USA
  • Volume
    2
  • fYear
    2001
  • fDate
    8-14 Dec. 2001
  • Abstract
    Bagging forms a committee of classifiers by bootstrap aggregation of training sets from a pool of training data. A simple alternative to bagging is to partition the data into disjoint subsets. Experiments on various datasets show that, given the same size partitions and bags, disjoint partitions result in better performance than bootstrap aggregates (bags). Many applications (e.g., protein structure prediction) involve the use of datasets that are too large to handle in the memory of a typical computer. Our results indicate that, in such applications, the simple approach of creating a committee of classifiers from disjoint partitions is preferred over the more complex approach of bagging.
  • Keywords
    data mining; learning (artificial intelligence); pattern classification; bagging; bootstrap aggregation; classifier committee; disjoint partitions; protein structure prediction; small dataset; training data pool; training sets; Aggregates; Application software; Bagging; Computer science; Data mining; Laboratories; Proteins; Sampling methods; Testing; Training data;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computer Vision and Pattern Recognition, 2001. CVPR 2001. Proceedings of the 2001 IEEE Computer Society Conference on
  • Conference_Location
    Kauai, HI, USA
  • ISSN
    1063-6919
  • Print_ISBN
    0-7695-1272-0
  • Type
    conf
  • DOI
    10.1109/CVPR.2001.991030
  • Filename
    991030
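
The two sampling schemes contrasted in the abstract can be illustrated with a minimal sketch. It assumes only Python's standard library; the function names (`disjoint_partitions`, `bootstrap_bags`, `majority_vote`), the fixed seeds, and the toy pool of 1000 integer IDs are hypothetical and are not taken from the paper. The sketch shows only how the training subsets are formed and how committee votes might be combined, not which classifiers the authors trained.

```python
import random
from collections import Counter


def disjoint_partitions(items, n_parts, seed=0):
    """Shuffle the pool, then split it into n_parts non-overlapping subsets."""
    rng = random.Random(seed)
    shuffled = list(items)
    rng.shuffle(shuffled)
    # Striding by n_parts gives subsets whose sizes differ by at most one.
    return [shuffled[i::n_parts] for i in range(n_parts)]


def bootstrap_bags(items, n_bags, bag_size, seed=0):
    """Draw n_bags samples of bag_size each, with replacement, from the pool."""
    rng = random.Random(seed)
    pool = list(items)
    return [[rng.choice(pool) for _ in range(bag_size)] for _ in range(n_bags)]


def majority_vote(predictions):
    """Combine one predicted label per committee member into a single label."""
    return Counter(predictions).most_common(1)[0][0]


# Toy pool of example IDs (an assumption made purely for illustration).
pool = list(range(1000))
parts = disjoint_partitions(pool, n_parts=4)
# Bags are given the same size as the partitions, as in the comparison above.
bags = bootstrap_bags(pool, n_bags=4, bag_size=len(parts[0]))
```

Each partition or bag would then train one committee member, and `majority_vote` would combine the members' predicted labels for a test example. Every example in the pool appears in exactly one partition, whereas a bag may repeat some examples and omit others.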