• DocumentCode
    3703556
  • Title
    P-N-RMiner: A generic framework for mining interesting structured relational patterns

  • Author
    Jefrey Lijffijt; Eirini Spyropoulou; Bo Kang; Tijl De Bie
  • Author_Institution
    Intelligent Systems Lab, University of Bristol, UK
  • fYear
    2015
  • Firstpage
    1
  • Lastpage
    10
  • Abstract
    Local pattern mining methods are fragmented along two dimensions: the pattern syntax, and the data types on which they are applicable. Pattern syntaxes considered in the literature include subgroups, n-sets, itemsets, and many more; common data types include binary, categorical, and real-valued. Recent research on pattern mining in relational databases has shown how the aforementioned pattern syntaxes can be unified in a single framework. However, a unified understanding of how to deal with various data types is lacking, certainly for more complexly structured types such as time of day (which is circular), geographical location, terms from a taxonomy, etc. In this paper, we introduce a generic approach for mining interesting local patterns in (relational) data involving such structured data types as attributes. Importantly, we show how this can be done in a generic manner, by modelling the structure within a set of attribute values as a partial order. We then derive a measure of subjective interestingness of such patterns using Information Theory, and propose an algorithm for effectively enumerating all patterns of this syntax. Through empirical evaluation, we found that (a) the new interestingness derivation is relevant and cannot be approximated using existing tools, (b) the new tool, P-N-RMiner, finds patterns that are substantially more informative, and (c) the new enumeration algorithm is considerably faster.
  • Keywords
    "Data mining","Syntactics","Relational databases","Taxonomy","Itemsets","Approximation algorithms"
  • Publisher
    IEEE
  • Conference_Titel
    2015 IEEE International Conference on Data Science and Advanced Analytics (DSAA)
  • Print_ISBN
    978-1-4673-8272-4
  • Type
    conf
  • DOI
    10.1109/DSAA.2015.7344837
  • Filename
    7344837