DocumentCode
479779
Title
A Mutual-Information-Based Approach to Entity Reconciliation in Heterogeneous Databases
Author
Bao-hua Qiang ; Xi, Jian-qing ; Bao-hua Qiang
Author_Institution
Sch. of Comput. Sci. & Eng., South China Univ. of Technol., Guangzhou
Volume
1
fYear
2008
fDate
12-14 Dec. 2008
Firstpage
666
Lastpage
669
Abstract
Entity reconciliation is crucial to data interoperability in heterogeneous databases. In our previous work, we proposed an entity matching algorithm based on attribute entropy to identify corresponding entities; it addresses the limitations of the main existing approaches and markedly improves the precision of entity matching. On further investigation, we found that attributes of differing importance for identifying entities can receive the same weight when weights are derived from attribute entropy alone. In this paper we therefore use mutual information to quantify attribute weights, because mutual information captures the correlation between the probability distributions of two attributes. Based on this idea, we present the final entropy computation algorithm and a mutual-information-based entity reconciliation algorithm. Experimental results on real-world data show that the mutual-information-based approach achieves better performance.
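The distinction the abstract draws between attribute entropy and mutual information can be illustrated with a small sketch. It is not the authors' algorithm, whose weighting formula is not given in this record; the function names, the toy columns (`ref`, `attr1`, `attr2`), and the choice of a reference attribute are all assumptions made for illustration. It only shows that two attributes with identical entropy can share very different amounts of information with a third attribute.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a discrete attribute column."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def mutual_information(xs, ys):
    """I(X;Y) = H(X) + H(Y) - H(X,Y) for two aligned discrete columns."""
    return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))


# Hypothetical toy data: one reference attribute and two candidate attributes.
ref = ["a", "a", "b", "b"]
attr1 = ["x", "x", "y", "y"]  # varies together with ref
attr2 = ["x", "y", "x", "y"]  # independent of ref

# Both candidates have the same entropy, so entropy alone cannot rank them.
h1, h2 = entropy(attr1), entropy(attr2)

# Mutual information with the reference attribute separates them.
mi1 = mutual_information(attr1, ref)
mi2 = mutual_information(attr2, ref)
```

In this toy case both candidate attributes have an entropy of 1 bit, while their mutual information with `ref` is 1 bit and 0 bits respectively, which is the kind of difference an entropy-only weighting would miss.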
Keywords
data handling; distributed databases; entropy; open systems; attribute entropy; data interoperability; entities matching algorithm; entity reconciliation; heterogeneous databases; mutual-information-based approach; Computer science; Data engineering; Databases; Distributed computing; Educational institutions; Entropy; Information science; Mutual information; Probability distribution; Software engineering; entities matching; mutual information
fLanguage
English
Publisher
IEEE
Conference_Titel
Computer Science and Software Engineering, 2008 International Conference on
Conference_Location
Wuhan, Hubei
Print_ISBN
978-0-7695-3336-0
Type
conf
DOI
10.1109/CSSE.2008.535
Filename
4721837