• DocumentCode
    3603463
  • Title
    Extending Conditional Dependencies with Built-in Predicates
  • Author
    Shuai Ma ; Liang Duan ; Wenfei Fan ; Chunming Hu ; Wenguang Chen
  • Author_Institution
    SKLSDE Lab., Beihang Univ., Beijing, China
  • Volume
    27
  • Issue
    12
  • fYear
    2015
  • Firstpage
    3274
  • Lastpage
    3288
  • Abstract
    This paper proposes a natural extension of conditional functional dependencies (CFDs [1]) and conditional inclusion dependencies (CINDs [2]), denoted by CFDps and CINDps, respectively, by specifying patterns of data values with ≠, <, ≤, >, and ≥ predicates. As data quality rules, CFDps and CINDps are able to capture errors that commonly arise in practice but cannot be detected by CFDs and CINDs. We establish two sets of results for central technical problems associated with CFDps and CINDps. (a) One concerns the satisfiability and implication problems for CFDps and CINDps, taken separately or together. These are important for, e.g., deciding whether data quality rules are dirty themselves, and for removing redundant rules. We show that despite the increased expressive power, the static analyses of CFDps and CINDps retain the same complexity as their CFD and CIND counterparts. (b) The other concerns validation of CFDps and CINDps. We show that given a set Σ of CFDps and CINDps on a database D, a set of SQL queries can be automatically generated that, when evaluated against D, return all tuples in D that violate some dependencies in Σ. We also experimentally verified the efficiency and effectiveness of our SQL-based error detection techniques, using real-life data. This provides commercial DBMS with an immediate capability to detect errors based on CFDps and CINDps.
  • Keywords
    SQL; computational complexity; data analysis; query processing; relational databases; CFDp; CINDp; SQL based error detection techniques; SQL queries; central technical problems; complexity analysis; conditional functional dependencies; conditional inclusion dependencies; data quality rules; data value pattern specification; predicates; static analysis; tuples; Complexity theory; Computational fluid dynamics; Databases; Pattern matching; Phase frequency detector; Semantics; Conditional dependencies; built-in predicates; data quality; functional dependencies; inclusion dependencies
  • fLanguage
    English
  • Journal_Title
    Knowledge and Data Engineering, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1041-4347
  • Type
    jour
  • DOI
    10.1109/TKDE.2015.2451632
  • Filename
    7145438