• DocumentCode
    3520101
  • Title

    Data Integration on Multiple Data Sets

  • Author

    Mi, Tian ; Aseltine, Robert ; Rajasekaran, Sanguthevar

  • Author_Institution
    Dept. of CSE, Univ. of Connecticut, Storrs, CT
  • fYear
    2008
  • fDate
    3-5 Nov. 2008
  • Firstpage
    443
  • Lastpage
    446
  • Abstract
    A critical issue in dealing with voluminous records is data integration. Integration of data from two databases has been well studied. For example, FEBRL is an excellent system for integrating two databases. Little work has been done on integrating more than two databases. In practice, however, health care networks often have to integrate many more than two databases. In this paper we offer hierarchical-clustering-based solutions for integrating multiple data sets. We also present experimental data indicating that our algorithms perform well.
  • Keywords
    bioinformatics; data handling; database management systems; health care; medical administrative data processing; pattern clustering; FEBRL; Freely Extensible Biomedical Record Linkage; data integration; health care networks; hierarchical clustering; multiple data sets; Bioinformatics; Clustering algorithms; Costs; Databases; Dynamic programming; Employment; Government; Medical services; Public healthcare; USA Councils; data deduplication
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Bioinformatics and Biomedicine, 2008. BIBM '08. IEEE International Conference on
  • Conference_Location
    Philadelphia, PA
  • Print_ISBN
    978-0-7695-3452-7
  • Type

    conf

  • DOI
    10.1109/BIBM.2008.48
  • Filename
    4684936