DocumentCode
3666566
Title
A proposition for resilient graph-based record linkage using parallel processing on distributed networks
Author
Joseph Jupin;Justin Y. Shi
Author_Institution
Computer and Information Sciences Department, Temple University, Philadelphia, PA, USA
fYear
2015
fDate
8/1/2015 12:00:00 AM
Firstpage
1
Lastpage
3
Abstract
Industrial and governmental organizations have accrued vast amounts of data contained in many databases. Many of these databases are developed by different organizations for different purposes, may contain millions of unique entities, and may lack a dependable global unique identifier to link an individual's records across multiple databases. Record Linkage (RL) is a process that connects records from multiple heterogeneous databases that refer to the same or a sufficiently similar entity [1]. Whether the RL system uses a deterministic or probabilistic [2] methodology, it must compare the data within each pair of candidate records field by field. Demographic and other data are used in pattern matching to determine whether two records belong to the same entity. RL is a data- and compute-intensive, mission-critical process for many organizations. The process must be efficient enough to handle big data, effective enough to provide accurate matches, and resilient enough to ensure reliable operation.
Keywords
"Servers","Databases","Approximation algorithms","Couplings","Random access memory","Parallel processing","Computers"
Publisher
IEEE
Conference_Titel
Resilience Week (RWS), 2015
Type
conf
DOI
10.1109/RWEEK.2015.7287442
Filename
7287442