DocumentCode
3520101
Title
Data Integration on Multiple Data Sets
Author
Mi, Tian ; Aseltine, Robert ; Rajasekaran, Sanguthevar
Author_Institution
Dept. of CSE, Univ. of Connecticut, Storrs, CT
fYear
2008
fDate
3-5 Nov. 2008
Firstpage
443
Lastpage
446
Abstract
A critical issue in dealing with voluminous records is data integration. Integration of data from two databases has been studied well; for example, FEBRL is an excellent system for integrating two databases. Much less work has been done on integrating more than two databases. In practice, however, health care networks often have to integrate many more than two. In this paper we offer hierarchical-clustering-based solutions for integrating multiple data sets. We also present experimental data indicating that our algorithms perform well.
Keywords
bioinformatics; data handling; database management systems; health care; medical administrative data processing; pattern clustering; FEBRL; Freely Extensible Biomedical Record Linkage; data integration; health care networks; hierarchical clustering; multiple data sets; Bioinformatics; Clustering algorithms; Costs; Databases; Dynamic programming; Employment; Government; Medical services; Public healthcare; USA Councils; data deduplication
fLanguage
English
Publisher
IEEE
Conference_Titel
Bioinformatics and Biomedicine, 2008. BIBM '08. IEEE International Conference on
Conference_Location
Philadelphia, PA
Print_ISBN
978-0-7695-3452-7
Type
conf
DOI
10.1109/BIBM.2008.48
Filename
4684936