DocumentCode
657982
Title
Similar data elimination: MFB algorithm
Author
Boufares, Faouzi ; Ben Salem, Aicha ; Rehab, Moufida ; Correia, Sebastiao
Author_Institution
Lab. LIPN, Univ. Paris 13, Villetaneuse, France
fYear
2013
fDate
6-8 May 2013
Firstpage
289
Lastpage
293
Abstract
Complex applications such as knowledge extraction, data mining, e-learning, and web applications use heterogeneous and distributed data. In this context, the need for data integration and for improved data quality is increasingly felt. Eliminating duplicate and similar data remains an open problem, both in terms of performance and in terms of defining similarity rules. In this paper, we present a new deduplication algorithm based on two functions, Match and Merge. The algorithm is evaluated experimentally on a set of randomly generated data.
Keywords
data analysis; data integration; data mining; merging; MFB algorithm; Match function; Merge function; Web applications; data quality improvement; deduplication algorithm; distributed data; duplicate elimination; e-learning; heterogeneous data; knowledge extraction; similar data elimination; similarity rules; Cleaning; Companies; Couplings; Data mining; Knowledge discovery; Semantics; Switches; Data Quality; Deduplication; Duplicates; Match; Merge; Similar Data
fLanguage
English
Publisher
IEEE
Conference_Titel
Control, Decision and Information Technologies (CoDIT), 2013 International Conference on
Conference_Location
Hammamet
Print_ISBN
978-1-4673-5547-6
Type
conf
DOI
10.1109/CoDIT.2013.6689559
Filename
6689559