• DocumentCode
    1605046
  • Title
    Learning from soft partitions of data: reducing the variance
  • Author
    Eschrich, Sebastian ; Hall, Lawrence O.
  • Author_Institution
    Dept. of Comput. Sci. & Eng., Univ. of South Florida, Tampa, FL, USA
  • Volume
    1
  • fYear
    2003
  • Firstpage
    666
  • Abstract
    Distributed machine learning can be realized using a divide-and-conquer methodology. One such divide-and-conquer method is learning from soft partitions of data. By examining the decomposition of classifier error into bias and variance terms, we see that learning from smaller partitions of data introduces higher variance. In this paper, we investigate the use of a particular variance reduction technique, randomized C4.5, when learning from soft partitions of data. This approach maintains the distributed nature of the learning algorithm while boosting the overall classification accuracy. Experiments on six machine learning datasets demonstrate the accuracy gains obtained by reducing classifier variance. In particular, learning from soft partitions of data can produce more accurate classifiers than using an ensemble of randomized decision trees constructed from the entire dataset, which in turn results in a more accurate classifier than building a single decision tree.
  • Keywords
    data mining; decision trees; divide and conquer methods; fuzzy set theory; learning (artificial intelligence); bias terms; classifier error decomposition; distributed machine learning; divide and conquer methodology; k-means clustering; localized bagging; randomized C4.5; soft partitions of data; variance reduction technique; variance terms; Bagging; Boosting; Classification tree analysis; Computer errors; Computer science; Decision trees; Learning systems; Machine learning; Neurons; Partitioning algorithms;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Fuzzy Systems, 2003. FUZZ '03. The 12th IEEE International Conference on
  • Print_ISBN
    0-7803-7810-5
  • Type
    conf
  • DOI
    10.1109/FUZZ.2003.1209443
  • Filename
    1209443