Title :
RuleMR: Classification rule discovery with MapReduce
Author :
Kolias, Vasilis ; Kolias, Constantinos ; Anagnostopoulos, Ioannis ; Kayafas, Eleftherios
Author_Institution :
Nat. Tech. Univ. of Athens, Athens, Greece
Abstract :
The vast amounts of data generated, exchanged and consumed on a daily basis by contemporary networks and devices render their analysis a cumbersome procedure with inherent difficulties. On the one hand, the need for efficient Machine Learning algorithms and tools that scale to large datasets is continuously growing. On the other hand, parallel or distributed solutions have proven to conceal many pitfalls. The MapReduce programming model has quickly emerged as the de facto model for executing simple algorithmic tasks over huge volumes of data, since it is simple, highly abstract and efficient. However, due to its unidirectional communication model and its inherent lack of support for iterative execution, few Machine Learning algorithms can easily be implemented on MapReduce. In this paper, we present a classification rule discovery algorithm, named RuleMR, which, despite its iterative nature, can capitalize on MapReduce. In order to construct quality rules in fewer iterations, the algorithm exploits the distributed nature of MapReduce to explore only the promising areas of the search space. We conduct a series of experimental evaluations which indicate that the proposed approach not only scales well with respect to the size of the training dataset but also, in many cases, yields a model whose accuracy is comparable to that of many well-known algorithms.
Keywords :
data mining; pattern classification; MapReduce programming model; RuleMR; classification rule discovery algorithm; quality rules; big data; classification; machine learning; mapreduce; rule induction
Conference_Title :
2014 IEEE International Conference on Big Data (Big Data)
Conference_Location :
Washington, DC
DOI :
10.1109/BigData.2014.7004440