Title :
Entity Resolution with Markov Logic
Author :
Singla, Parag ; Domingos, Pedro
Author_Institution :
Dept. of Comput. Sci. & Eng., Univ. of Washington, Seattle, WA
Abstract :
Entity resolution is the problem of determining which records in a database refer to the same entities, and is a crucial and expensive step in the data mining process. Interest in it has grown rapidly, and many approaches have been proposed. However, they tend to address only isolated aspects of the problem, and are often ad hoc. This paper proposes a well-founded, integrated solution to the entity resolution problem based on Markov logic. Markov logic combines first-order logic and probabilistic graphical models by attaching weights to first-order formulas, and viewing them as templates for features of Markov networks. We show how a number of previous approaches can be formulated and seamlessly combined in Markov logic, and how the resulting learning and inference problems can be solved efficiently. Experiments on two citation databases show the utility of this approach, and evaluate the contribution of the different components.
Keywords :
Markov processes; data mining; database management systems; entity-relationship modelling; formal logic; graph theory; inference mechanisms; learning (artificial intelligence); probability; Markov logic; Markov networks; citation databases; data mining; entity resolution; first-order logic; inference problems; learning problems; probabilistic graphical models; Computer science; Couplings; Data engineering; Data mining; Graphical models; Joining processes; Logistics; Markov random fields; Probabilistic logic; Spatial databases;
Conference_Title :
Data Mining, 2006. ICDM '06. Sixth International Conference on
Conference_Location :
Hong Kong
Print_ISBN :
0-7695-2701-7
DOI :
10.1109/ICDM.2006.65