DocumentCode :
3576305
Title :
SIER: An Efficient Entity Resolution Mechanism Combining SNM and Iteration
Author :
Taiming Wang ; Yue Kou ; Derong Shen ; Heng Liu ; Ge Yu
Author_Institution :
Coll. of Inf. Sci. & Eng., Northeastern Univ., Shenyang, China
fYear :
2014
Firstpage :
238
Lastpage :
241
Abstract :
With the rapid growth of data, entity resolution (ER) faces two challenges: achieving high quality and high performance. Correspondingly, current work focuses on either iteration-based or sorted neighborhood method (SNM)-based entity resolution. The former iteratively merges similar records to achieve higher precision and recall. The latter compares only records within the same sliding window to maintain higher performance. However, each comes at a cost: the former sacrifices efficiency, and the latter sacrifices result quality. In this paper, we present SIER, an entity resolution mechanism that combines SNM and iteration. Unlike traditional approaches, SIER exploits the advantages of both SNM and iteration. We also propose a two-stage entity matching algorithm. In the first stage, records are initially matched using a sliding window. In the second stage, the matching result is rectified iteratively to improve its quality. Experiments demonstrate the feasibility and effectiveness of our method.
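The two-stage idea in the abstract can be illustrated with a minimal sketch. This is not the authors' SIER algorithm, whose details the abstract does not give. It assumes records are plain strings, uses token-set Jaccard similarity as a stand-in match function, and sorts on the raw string as the blocking key. The names `tokens`, `jaccard`, `snm_stage`, and `iterate_stage`, and the `window` and `threshold` parameters, are all hypothetical.

```python
def tokens(record):
    """Assumed representation: a record is a string, compared as a set of lowercase tokens."""
    return set(record.lower().split())


def jaccard(a, b):
    """Assumed match function: Jaccard similarity of two token sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def snm_stage(records, window=2, threshold=0.5):
    """Stage 1 (sketch): sort records, then compare only records inside the same sliding window."""
    order = sorted(range(len(records)), key=lambda i: records[i])
    parent = list(range(len(records)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for pos, i in enumerate(order):
        # Each record is compared with the next (window - 1) records in sort order.
        for j in order[pos + 1 : pos + window]:
            if jaccard(tokens(records[i]), tokens(records[j])) >= threshold:
                parent[find(j)] = find(i)

    clusters = {}
    for i in range(len(records)):
        clusters.setdefault(find(i), []).append(i)
    return list(clusters.values())


def iterate_stage(records, clusters, threshold=0.5):
    """Stage 2 (sketch): merge matching clusters repeatedly until no pair matches."""
    merged = [(list(c), set().union(*(tokens(records[i]) for i in c))) for c in clusters]
    changed = True
    while changed:
        changed = False
        for a in range(len(merged)):
            for b in range(a + 1, len(merged)):
                if jaccard(merged[a][1], merged[b][1]) >= threshold:
                    # A merged cluster carries the union of its members' tokens,
                    # so a merge can enable further matches in a later pass.
                    merged[a] = (merged[a][0] + merged[b][0], merged[a][1] | merged[b][1])
                    del merged[b]
                    changed = True
                    break
            if changed:
                break
    return sorted(sorted(members) for members, _ in merged)


records = [
    "john smith boston",
    "smith john boston",
    "kate lee york",
    "mary ann paris",
    "kate lee",
]
stage1 = snm_stage(records)
final = iterate_stage(records, stage1)
```

In this toy input, "john smith boston" and "smith john boston" sort far apart, so a window of 2 never compares them in the first stage; the iterative second stage is what brings them together. The second stage here compares all cluster pairs, which is a simplification chosen for brevity rather than efficiency.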
Keywords :
data handling; iterative methods; ER; SIER; SNM; entity resolution mechanism; iteration-based entity resolution; result quality; sliding window; sorted neighborhood-based entity resolution; two-stage entity matching algorithm; Clustering algorithms; Couplings; Educational institutions; Erbium; Iterative methods; Merging; Sorting; iterative entity resolution; sliding window; sorted neighborhood;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Web Information System and Application Conference (WISA), 2014 11th
Print_ISBN :
978-1-4799-5726-2
Type :
conf
DOI :
10.1109/WISA.2014.50
Filename :
7058019