• DocumentCode
    2350412
  • Title
    MReC4.5: C4.5 Ensemble Classification with MapReduce
  • Author
    Wu, Gongqing ; Li, Haiguang ; Hu, Xuegang ; Bi, Yuanjun ; Zhang, Jing ; Wu, Xindong
  • Author_Institution
    Sch. of Comput. Sci. & Inf. Eng., Hefei Univ. of Technol., Hefei, China
  • fYear
    2009
  • fDate
    21-22 Aug. 2009
  • Firstpage
    249
  • Lastpage
    255
  • Abstract
    Classification is an important technique in data mining research and applications. C4.5 is a widely used classification method, and ensemble learning adopts a parallel and distributed computing model for classification. Based on analyses of the MapReduce computing paradigm and the process of ensemble learning, we find that the parallel and distributed computing model in MapReduce is appropriate for implementing ensemble learning. This paper takes advantage of C4.5, ensemble learning, and the MapReduce computing model, and proposes a new method, MReC4.5, for parallel and distributed ensemble classification. Our experimental results show that increasing the number of nodes benefits the effectiveness of classification modeling, and that serialization operations at the model level make the MReC4.5 classifier "construct once, use anywhere".
  • Keywords
    data mining; decision trees; grid computing; learning (artificial intelligence); C4.5 ensemble classification; MReC4.5; MapReduce; classification; distributed computing; ensemble learning; parallel computing; serialization operations; Classification algorithms; Cloud computing; Computer science; Concurrent computing; Parallel programming; Testing; Training data
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Fourth ChinaGrid Annual Conference (ChinaGrid '09), 2009
  • Conference_Location
    Yantai, Shandong
  • Print_ISBN
    978-0-7695-3818-1
  • Type
    conf
  • DOI
    10.1109/ChinaGrid.2009.39
  • Filename
    5329047