• DocumentCode
    1791761
  • Title
    RuleMR: Classification rule discovery with MapReduce
  • Author
    Kolias, Vasilis ; Kolias, Constantinos ; Anagnostopoulos, Ioannis ; Kayafas, Eleftherios
  • Author_Institution
    Nat. Tech. Univ. of Athens, Athens, Greece
  • fYear
    2014
  • fDate
    27-30 Oct. 2014
  • Firstpage
    20
  • Lastpage
    28
  • Abstract
    The vast amounts of data generated, exchanged and consumed on a daily basis by contemporary networks and devices render their analysis a cumbersome procedure with inherent difficulties. On the one hand, the need for efficient Machine Learning algorithms and tools that scale to large datasets is continuously growing. On the other hand, parallel or distributed solutions have proven to conceal many pitfalls. The MapReduce programming model has quickly emerged as the de facto model for executing simple algorithmic tasks over huge volumes of data, since it is simple, highly abstract and efficient. However, due to its unidirectional communication model and inherent lack of support for iterative execution, few Machine Learning algorithms can easily be implemented on MapReduce. In this paper, we present a classification rule discovery algorithm, namely RuleMR, which, despite its iterative nature, can capitalize on MapReduce. In order to construct quality rules in fewer iterations, the algorithm exploits the distributed nature of MapReduce to explore only the promising areas of the search space. We conduct a series of experimental evaluations which indicate that the proposed approach not only scales well with respect to the size of the training dataset but also, in many cases, yields a model comparable in accuracy to many well-known algorithms.
  • Keywords
    data mining; pattern classification; MapReduce programming model; RuleMR; classification rule discovery algorithm; quality rules; big data; classification; machine learning; mapreduce; rule induction
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Big Data (Big Data), 2014 IEEE International Conference on
  • Conference_Location
    Washington, DC
  • Type
    conf
  • DOI
    10.1109/BigData.2014.7004440
  • Filename
    7004440