• DocumentCode
    34277
  • Title

    Discovering Conservation Rules

  • Author

    Golab, Lukasz ; Karloff, Howard ; Korn, Flip ; Saha, Balaram ; Srivastava, Divesh

  • Author_Institution
    Dept. of Eng., Univ. of Waterloo, Waterloo, ON, Canada
  • Volume
    26
  • Issue
    6
  • fYear
    2014
  • fDate
    Jun-14
  • Firstpage
    1332
  • Lastpage
    1348
  • Abstract
    Many applications process data in which there exists a “conservation law” between related quantities. For example, in traffic monitoring, every incoming event, such as a packet's entering a router or a car's entering an intersection, should ideally have an immediate outgoing counterpart. We propose a new class of constraints, Conservation Rules, that express the semantics and characterize the data quality of such applications. We give confidence metrics that quantify how strongly a conservation rule holds and present approximation algorithms (with error guarantees) for the problem of discovering a concise summary of subsets of the data that satisfy a given conservation rule. Using real data, we demonstrate the utility of conservation rules and we show order-of-magnitude performance improvements of our discovery algorithms over naive approaches.
  • Keywords
    approximation theory; data mining; approximation algorithms; confidence metrics; conservation law; conservation rule discovery; data process; performance improvements; traffic monitoring; Approximation algorithms; Database systems; Electricity; IP networks; Monitoring; Data mining; Database Applications; Database Management; Database semantics; Information Technology and Systems; Languages; Mining methods and algorithms
  • fLanguage
    English
  • Journal_Title
    Knowledge and Data Engineering, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1041-4347
  • Type

    jour

  • DOI
    10.1109/TKDE.2012.171
  • Filename
    6276207