• DocumentCode
    3666566
  • Title

    A proposition for resilient graph-based record linkage using parallel processing on distributed networks

  • Author

    Joseph Jupin;Justin Y. Shi

  • Author_Institution
    Computer and Information Sciences Department Temple University Philadelphia, PA, USA
  • fYear
    2015
  • fDate
    8/1/2015 12:00:00 AM
  • Firstpage
    1
  • Lastpage
    3
  • Abstract
    Industrial and governmental organizations have accrued vast amounts of data contained in many databases. Many of these databases are developed by different organizations for different purposes, may contain millions of unique entities and may lack a dependable global unique identifier to link an individual's records across multiple databases. Record Linkage (RL) is a process that connects records that are related to the identical or sufficiently similar entity from multiple heterogeneous databases [1]. Whether the RL system uses a deterministic or probabilistic [2] methodology, it is necessary to compare the data within each pair of candidate records, field-by-field. Demographic and other data is used for pattern matches to determine if two records belong to the same entity. RL is a data- and compute-intensive, mission-critical process for many organizations. The process must be efficient enough to process big data, effective enough to provide accurate matches and resilient enough to ensure reliable operation.
  • Keywords
    "Servers","Databases","Approximation algorithms","Couplings","Random access memory","Parallel processing","Computers"
  • Publisher
    ieee
  • Conference_Titel
    Resilience Week (RWS), 2015
  • Type

    conf

  • DOI
    10.1109/RWEEK.2015.7287442
  • Filename
    7287442