DocumentCode :
2182430
Title :
Duplicate detection in probabilistic data
Author :
Panse, Fabian ; Van Keulen, Maurice ; De Keijzer, Ander ; Ritter, Norbert
Author_Institution :
Comput. Sci. Dept., Univ. of Hamburg, Hamburg, Germany
fYear :
2010
fDate :
1-6 March 2010
Firstpage :
179
Lastpage :
182
Abstract :
Collected data often contains uncertainties. Probabilistic databases have been proposed to manage uncertain data. To combine data from multiple autonomous probabilistic databases, an integration of probabilistic data has to be performed. Until now, however, data integration approaches have focused on the integration of certain source data (relational or XML). There is no work on the integration of uncertain source data so far. In this paper, we present a first step towards a concise consolidation of probabilistic data. We focus on duplicate detection as a representative and essential step in an integration process. We present techniques for identifying multiple probabilistic representations of the same real-world entities.
Keywords :
XML; probability; relational databases; XML data; autonomous probabilistic databases; data integration approach; duplicate detection; probabilistic representations; relational data; uncertain data management; Astronomy; Computer science; Couplings; Data models; Electrostatic precipitators; Prototypes; Telescopes; Uncertainty
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Data Engineering Workshops (ICDEW), 2010 IEEE 26th International Conference on
Conference_Location :
Long Beach, CA
Print_ISBN :
978-1-4244-6522-4
Electronic_ISBN :
978-1-4244-6521-7
Type :
conf
DOI :
10.1109/ICDEW.2010.5452759
Filename :
5452759