Title :
EDP-ORD: Efficient distributed/parallel Optimal Rule Discovery
Author :
Ghanem, Sahar M. ; Mohamed, Mona A. ; Nagi, Magdy H.
Author_Institution :
Comput. & Syst. Eng. Dept., Alexandria Univ., Alexandria, Egypt
Date :
June 28 2011-July 1 2011
Abstract :
Association rule discovery algorithms generate all rules satisfying minimum support and confidence thresholds. These techniques yield too many rules and become infeasible when the minimum support is low. Recently, Li proposed the Optimal Rule Discovery (ORD) algorithm, which discovers a family of rule sets that maximize a range of interestingness metrics other than the commonly used confidence metric. In addition, the discovered optimal class association rule set is the minimum subset of rules with the same predictive power as the complete class association rule set. Moreover, ORD is significantly more efficient than association rule discovery, independent of the data structure and the implementation. Because of the huge amounts of data available, it is important to investigate efficient methods for distributed/parallel mining of rules. In this paper, we propose EDP-ORD, an efficient distributed/parallel extension of the ORD algorithm. We theoretically establish a relationship between locally large and globally large rules and use it to reduce the number of generated rules and exchanged messages at each site/partition. Moreover, we empirically compare EDP-ORD with a naïve distributed/parallel ORD version on five benchmark datasets. The experimental results show that the reduction in the number of generated rules at each site can reach 44%, while the reduction in the total size of exchanged messages can reach 58%.
Keywords :
data mining; parallel algorithms; EDP-ORD; association rule discovery algorithm; distributed parallel mining; efficient distributed parallel optimal rule discovery; optimal class association rule set; Association rules; Classification algorithms; Distributed databases; Itemsets; Measurement; Silicon; association rule discovery; distributed rule discovery; parallel rule discovery; rule-based classifiers
Conference_Title :
Computers and Communications (ISCC), 2011 IEEE Symposium on
Conference_Location :
Kerkyra
Print_ISBN :
978-1-4577-0680-6
Electronic_ISBN :
1530-1346
DOI :
10.1109/ISCC.2011.5983965