DocumentCode :
3761521
Title :
An Automatic Discovery Framework of Cross-Source Data Inconsistency for Web Big Data
Author :
Sha Yang;Wei Yu;Yahui Hu;Kai Wang;Jun Wang;Shijun Li
Author_Institution :
Sch. of Comput., Wuhan Univ., Wuhan, China
fYear :
2015
Firstpage :
73
Lastpage :
79
Abstract :
The vigorous growth of big data has created both opportunities and challenges in business and industry. However, Web big data distributed across diverse sources with multiple data structures frequently conflict with each other, resulting in inconsistency in cross-source Web big data. In this paper, we propose a state-of-the-art architecture for automatically discovering inconsistency in Web big data. Our contributions are as follows: (1) we classify inconsistency features to formalize inconsistent data and establish an algebraic operation system; (2) we propose three algorithms to auto-discover inconsistency, namely constraint-based, SDA-based and HPDM-based methods; and (3) we conduct experiments on a real-world dataset to compare these schemes with an Oracle-based inconsistency detection framework. The empirical results show that our methods outperform the traditional framework in both accuracy and efficiency on Web big data.
Keywords :
"Big data","Data models","Computers","Data mining","Industries","Algorithm design and analysis","Distributed databases"
Publisher :
IEEE
Conference_Titel :
Advanced Cloud and Big Data, 2015 Third International Conference on
Print_ISBN :
978-1-4673-8537-4
Type :
conf
DOI :
10.1109/CBD.2015.22
Filename :
7435456