DocumentCode :
1467235
Title :
Learning-Based Approaches for Matching Web Data Entities
Author :
Köpcke, Hanna ; Thor, Andreas ; Rahm, Erhard
Author_Institution :
Univ. of Leipzig, Leipzig, Germany
Volume :
14
Issue :
4
Year :
2010
Firstpage :
23
Lastpage :
31
Abstract :
Entity matching is a key task for data integration and especially challenging for Web data. Effective entity matching typically requires combining several match techniques and finding suitable configuration parameters, such as similarity thresholds. The authors investigate to what degree machine learning helps semi-automatically determine suitable match strategies with a limited amount of manual training effort. They use a new framework, Fever, to evaluate several learning-based approaches for matching different sets of Web data entities. In particular, they study different approaches for training-data selection and how much training is needed to find effective combined match strategies and configurations.
Keywords :
Internet; learning (artificial intelligence); pattern matching; Fever; Web data entity matching; data management; entity resolution; fuzzy join; learning-based approaches; machine learning; object matching; similarity thresholds; training-data selection; Web data integration; entity matching
Language :
English
Journal_Title :
Internet Computing, IEEE
Publisher :
IEEE
ISSN :
1089-7801
Type :
jour
DOI :
10.1109/MIC.2010.58
Filename :
5445070