Title :
Uniform techniques for deriving similarities of objects and subschemes in heterogeneous databases
Author :
Palopoli, Luigi ; Saccà, Domenico ; Terracina, Giorgio ; Ursino, Domenico
Author_Institution :
D.I.M.E.T., Università "Mediterranea" di Reggio Calabria, Italy
Abstract :
The availability of automatic tools for inferring the semantics of database schemes is useful for solving several database design problems, such as obtaining cooperative information systems or data warehouses from large sets of data sources. In this context, a main problem is to single out similarities or dissimilarities among scheme objects (interscheme properties). This paper presents graph-based techniques for a uniform derivation of interscheme properties, including synonymies, homonymies, type conflicts, and subscheme similarities. These techniques share a common core: the computation of maximum weight matchings on bipartite weighted graphs derived using a suitable metric that measures the semantic closeness of objects. The techniques have been implemented in a system prototype. Several experiments conducted with it, some of which are reported in the paper, confirmed the effectiveness of our approach.
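The common core named in the abstract, a maximum weight matching on a bipartite weighted graph, can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `max_weight_matching`, the brute-force search, and the closeness values are all assumptions made for illustration, and the paper's actual metric and graph construction are not reproduced here. Weights are assumed to be non-negative.

```python
from itertools import permutations


def max_weight_matching(weights):
    """Brute-force maximum weight matching on a small bipartite graph.

    weights[i][j] is a hypothetical, non-negative closeness score between
    object i of one scheme and object j of another.
    Returns (total_weight, list of (i, j) matched pairs).
    """
    n_left = len(weights)
    n_right = len(weights[0]) if weights else 0

    # Keep the smaller side on the left so every left object gets a partner.
    if n_left > n_right:
        transposed = [list(col) for col in zip(*weights)]
        total, pairs = max_weight_matching(transposed)
        return total, sorted((i, j) for j, i in pairs)

    best_total, best_pairs = 0.0, []
    # Try every injective assignment of left objects to right objects.
    for assignment in permutations(range(n_right), n_left):
        total = sum(weights[i][j] for i, j in enumerate(assignment))
        if total > best_total:
            best_total = total
            best_pairs = list(enumerate(assignment))
    return best_total, best_pairs


# Illustrative closeness scores (made up) between the objects of two schemes.
closeness = [
    [0.9, 0.2, 0.1],
    [0.3, 0.8, 0.4],
]
total, pairs = max_weight_matching(closeness)
print(pairs, round(total, 2))
```

The exhaustive search is only practical for a handful of objects; a polynomial-time assignment algorithm such as the Hungarian method would be the usual substitute on larger graphs.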
Keywords :
data warehouses; distributed databases; bipartite weighted graphs; cooperative information systems; data sources; database schemes; graph-based techniques; heterogeneous databases; homonymies; maximum weight matchings; metrics; object similarities; semantic closeness; semantics; subscheme similarities; synonymies; type conflicts; Character recognition; Databases; Dictionaries; High definition video; Humans; Information systems; Management information systems; Object detection; Prototypes
Journal_Title :
Knowledge and Data Engineering, IEEE Transactions on
DOI :
10.1109/TKDE.2003.1185834