• DocumentCode
    2457372
  • Title

    Discovering Conservation Rules

  • Author

    Golab, Lukasz ; Karloff, Howard ; Korn, Flip ; Saha, Barna ; Srivastava, Divesh

  • Author_Institution
    Univ. of Waterloo, Waterloo, ON, Canada
  • fYear
    2012
  • fDate
    1-5 April 2012
  • Firstpage
    738
  • Lastpage
    749
  • Abstract
    Many applications process data in which there exists a ``conservation law´´ between related quantities. For example, in traffic monitoring, every incoming event, such as a packet´s entering a router or a car´s entering an intersection, should ideally have an immediate outgoing counterpart. We propose a new class of constraints -- Conservation Rules -- that express the semantics and characterize the data quality of such applications. We give confidence metrics that quantify how strongly a conservation rule holds and present approximation algorithms (with error guarantees) for the problem of discovering a concise summary of subsets of the data that satisfy a given conservation rule. Using real data, we demonstrate the utility of conservation rules and we show order-of-magnitude performance improvements of our discovery algorithms over naive approaches.
  • Keywords
    approximation theory; data handling; approximation algorithm; confidence metrics; conservation law; conservation rules; data quality; discovery algorithm; error guarantees; naive approach; Approximation algorithms; Delay; Electricity; IP networks; Monitoring; Roads;
  • fLanguage
    English
  • Publisher
    ieee
  • Conference_Titel
    Data Engineering (ICDE), 2012 IEEE 28th International Conference on
  • Conference_Location
    Washington, DC
  • ISSN
    1063-6382
  • Print_ISBN
    978-1-4673-0042-1
  • Type

    conf

  • DOI
    10.1109/ICDE.2012.105
  • Filename
    6228129