DocumentCode :
1879479
Title :
Detecting abnormal data for ontology based information integration
Author :
Yu, Yang ; Heflin, Jeff
Author_Institution :
Dept. of Comput. Sci. & Eng., Lehigh Univ., Bethlehem, PA, USA
fYear :
2011
fDate :
23-27 May 2011
Firstpage :
431
Lastpage :
438
Abstract :
To better support information integration on Semantic Web data of varying quality, this paper proposes an approach to detect triples that reflect some sort of error. In particular, erroneous triples may arise from factual errors in the original data source, misuse of the ontology by the original data source, or errors in the integration process. Although diagnosing such errors is a difficult problem, we propose that the degree to which a triple deviates from similar triples can be an important heuristic for identifying errors. We detect such “abnormal triples” by learning probabilistic rules from the reference data and checking to what extent these rules agree with the triples. The system consists of two components, one for each of the two types of abnormal relational description that a Semantic Web statement could have, whether accidentally or maliciously: a statement could relate two resources that are unlikely to have anything in common, or an inappropriate predicate could be used to describe the relation between the two resources. A classification technique is adopted to learn statistical characteristics for detecting a suspect resource pair, i.e., a statement in which there is no significant relation between the subject and the object. For suspect usages of a predicate, the system learns semantic patterns for each predicate from indirect semantic connections between the subject/object pairs.
Keywords :
ontologies (artificial intelligence); semantic Web; abnormal data detection; classification technique; indirect semantic connections; ontology based information integration; reference data; semantic Web data; statistical characteristics; subject-object pairs; triples detection; Context; Joining processes; Neodymium; Ontologies; Probabilistic logic; Semantic Web; Semantics; Detecting abnormal data; Ontology based information integration;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Collaboration Technologies and Systems (CTS), 2011 International Conference on
Conference_Location :
Philadelphia, PA
Print_ISBN :
978-1-61284-638-5
Type :
conf
DOI :
10.1109/CTS.2011.5928721
Filename :
5928721