DocumentCode :
3703556
Title :
P-N-RMiner: A generic framework for mining interesting structured relational patterns
Author :
Jefrey Lijffijt;Eirini Spyropoulou;Bo Kang;Tijl De Bie
Author_Institution :
Intelligent Systems Lab, University of Bristol, UK
Year :
2015
Firstpage :
1
Lastpage :
10
Abstract :
Local pattern mining methods are fragmented along two dimensions: the pattern syntax, and the data types on which they are applicable. Pattern syntaxes considered in the literature include subgroups, n-sets, itemsets, and many more; common data types include binary, categorical, and real-valued. Recent research on pattern mining in relational databases has shown how the aforementioned pattern syntaxes can be unified in a single framework. However, a unified understanding of how to deal with various data types is lacking, certainly for more complexly structured types such as time of day (which is circular), geographical location, terms from a taxonomy, etc. In this paper, we introduce a generic approach for mining interesting local patterns in (relational) data involving such structured data types as attributes. Importantly, we show how this can be done in a generic manner, by modelling the structure within a set of attribute values as a partial order. We then derive a measure of subjective interestingness of such patterns using Information Theory, and propose an algorithm for effectively enumerating all patterns of this syntax. Through empirical evaluation, we found that (a) the new interestingness derivation is relevant and cannot be approximated using existing tools, (b) the new tool, P-N-RMiner, finds patterns that are substantially more informative, and (c) the new enumeration algorithm is considerably faster.
Keywords :
"Data mining","Syntactics","Relational databases","Taxonomy","Itemsets","Approximation algorithms"
Publisher :
IEEE
Conference_Title :
2015 IEEE International Conference on Data Science and Advanced Analytics (DSAA)
Print_ISBN :
978-1-4673-8272-4
Type :
conf
DOI :
10.1109/DSAA.2015.7344837
Filename :
7344837