• DocumentCode
    2730863
  • Title

    Conditional Functional Dependencies for Data Cleaning

  • Author

    Bohannon, P. ; Wenfei Fan ; Geerts, F. ; Xibei Jia ; Kementsietsidis, Anastasios

  • Author_Institution
    Yahoo Res., Yahoo Inc., Dallas, TX, USA
  • fYear
    2007
  • fDate
    15-20 April 2007
  • Firstpage
    746
  • Lastpage
    755
  • Abstract
    We propose a class of constraints, referred to as conditional functional dependencies (CFDs), and study their applications in data cleaning. In contrast to traditional functional dependencies (FDs) that were developed mainly for schema design, CFDs aim at capturing the consistency of data by incorporating bindings of semantically related values. For CFDs we provide an inference system analogous to Armstrong's axioms for FDs, as well as consistency analysis. Since CFDs allow data bindings, a large number of individual constraints may hold on a table, complicating detection of constraint violations. We develop techniques for detecting CFD violations in SQL as well as novel techniques for checking multiple constraints in a single query. We experimentally evaluate the performance of our CFD-based methods for inconsistency detection. This not only yields a constraint theory for CFDs but is also a step toward a practical constraint-based method for improving data quality.
  • Keywords
    SQL; data handling; data warehouses; Armstrong axioms; conditional functional dependencies; data bindings; data cleaning; data quality; Business; Cities and towns; Cleaning; Computational fluid dynamics; Constraint theory; Cost function; Data mining; Data warehouses; Statistics
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Data Engineering, 2007. ICDE 2007. IEEE 23rd International Conference on
  • Conference_Location
    Istanbul
  • Print_ISBN
    1-4244-0802-4
  • Type

    conf

  • DOI
    10.1109/ICDE.2007.367920
  • Filename
    4221723