DocumentCode :
1791761
Title :
RuleMR: Classification rule discovery with MapReduce
Author :
Kolias, Vasilis ; Kolias, Constantinos ; Anagnostopoulos, Ioannis ; Kayafas, Eleftherios
Author_Institution :
Nat. Tech. Univ. of Athens, Athens, Greece
fYear :
2014
fDate :
27-30 Oct. 2014
Firstpage :
20
Lastpage :
28
Abstract :
The vast amounts of data generated, exchanged and consumed on a daily basis by contemporary networks and devices render their analysis a cumbersome procedure with inherent difficulties. On the one hand, the need for efficient Machine Learning algorithms and tools that scale to large datasets is continuously growing. On the other, parallel or distributed solutions have proven to conceal many pitfalls. The MapReduce programming model has quickly emerged as the de facto model for executing simple algorithmic tasks over huge volumes of data, since it is simple, highly abstract and efficient. However, due to its unidirectional communication model and the inherent lack of support for iterative execution, few Machine Learning algorithms can easily be implemented on MapReduce. In this paper, we present a classification rule discovery algorithm, namely RuleMR, which, despite its iterative nature, can capitalize on MapReduce. In order to construct quality rules in fewer iterations, the algorithm exploits the distributed nature of MapReduce to explore only the promising areas in the search space. We conduct a series of experimental evaluations which indicate that the proposed approach not only scales well with respect to the size of the training dataset but also, in many cases, yields a model comparable in accuracy to many well-known algorithms.
Keywords :
data mining; pattern classification; MapReduce programming model; RuleMR; classification rule discovery algorithm; quality rules; big data; classification; machine learning; mapreduce; rule induction;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Big Data (Big Data), 2014 IEEE International Conference on
Conference_Location :
Washington, DC
Type :
conf
DOI :
10.1109/BigData.2014.7004440
Filename :
7004440