DocumentCode :
3739163
Title :
Unsupervised Measuring of Entity Resolution Consistency
Author :
Jeffrey Fisher;Qing Wang
Author_Institution :
Res. Sch. of Comput. Sci., Australian Nat. Univ., Acton, ACT, Australia
fYear :
2015
Firstpage :
218
Lastpage :
221
Abstract :
Entity resolution (ER) is a common data cleaning and data integration task that aims to determine which records in one or more data sets refer to the same real-world entities. In most cases no training data exists and the ER process involves considerable trial and error, with an often time-consuming manual evaluation required to determine whether the obtained results are good enough. We propose a method that makes use of transitive closure within triples of records to provide an early indication of inconsistency in an ER result in an unsupervised fashion. We test our approach on three real-world data sets with different similarity calculations and blocking approaches and show that our approach can detect problems with ER results early on without a manual evaluation.
Keywords :
"Erbium","Manuals","Indexing","Cities and towns","Approximation algorithms","IP networks","Integrated circuits"
Publisher :
IEEE
Conference_Title :
Data Mining Workshop (ICDMW), 2015 IEEE International Conference on
Electronic_ISBN :
2375-9259
Type :
conf
DOI :
10.1109/ICDMW.2015.162
Filename :
7395674
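Note on the abstract: the idea of checking transitive closure within triples of records can be illustrated with a minimal sketch. The abstract does not define the paper's actual measure, so everything below is an assumption made for illustration: the function names `triple_counts` and `inconsistency_ratio`, the representation of an ER result as a list of matched record pairs, and the ratio itself are hypothetical. A triple is treated as inconsistent when exactly two of its three pairs are matched, because transitivity would then imply the third match.

```python
from itertools import combinations


def triple_counts(records, matches):
    """Count triples that violate transitivity in a pairwise ER result.

    Hypothetical helper, not the paper's published measure.
    records: iterable of hashable record identifiers.
    matches: iterable of (a, b) pairs that the ER process declared matches.
    Returns (open_triples, linked_triples), where a linked triple has at
    least two matched pairs and an open triple has exactly two.
    """
    matched = {frozenset(pair) for pair in matches}

    def is_match(a, b):
        return frozenset((a, b)) in matched

    open_triples = 0
    linked_triples = 0
    for a, b, c in combinations(list(records), 3):
        n = is_match(a, b) + is_match(b, c) + is_match(a, c)
        if n >= 2:
            linked_triples += 1
            if n == 2:
                # Two matches imply the third by transitivity, yet it is absent.
                open_triples += 1
    return open_triples, linked_triples


def inconsistency_ratio(records, matches):
    """Share of linked triples that are not transitively closed (assumed metric)."""
    open_triples, linked_triples = triple_counts(records, matches)
    return open_triples / linked_triples if linked_triples else 0.0


records = ["r1", "r2", "r3", "r4"]
matches = [("r1", "r2"), ("r2", "r3"), ("r3", "r4"), ("r2", "r4")]
print(triple_counts(records, matches))  # (2, 3)
```

The brute-force loop inspects every triple, which is cubic in the number of records; on real data one would presumably restrict it to records that share a block, which is again an assumption rather than something the abstract states.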