Title :
Predicting Missing Provenance Using Semantic Associations in Reservoir Engineering
Author :
Zhao, Jing ; Gomadam, Karthik ; Prasanna, Viktor
Author_Institution :
Comput. Sci. Dept., Univ. of Southern California, Los Angeles, CA, USA
Abstract :
Provenance is becoming an important issue as a reliable estimator of data quality. However, provenance collection mechanisms in the reservoir engineering domain often result in missing provenance information. In this paper, we address the problem of predicting missing provenance information in reservoir engineering. Based on the observation that data items with specific semantic "connections" may share the same provenance, our approach annotates data items with domain entities defined in a domain ontology, and represents these "connections" as sequences of relationships (also known as semantic associations) in the ontology graph. By analyzing annotated historical datasets with complete provenance information, we capture semantic associations that may imply identical provenance. A statistical analysis is applied to assign confidence values to the discovered associations, which indicate how far each association can be trusted when it is used for future provenance prediction. The semantic associations, along with their confidence measures, are then used by a voting algorithm to predict the missing provenance information. Our evaluation shows that the average precision of our approach is above 85% when one third of the provenance information is missing.
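The confidence-weighted voting described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the statistical analysis or the voting rule, so the sketch assumes that confidence is the fraction of historical item pairs linked by an association that share provenance, and that each linked item votes for its own provenance with that confidence as weight. The function names, relationship names, and provenance labels are all hypothetical.

```python
from collections import defaultdict


def estimate_confidence(observations):
    """Estimate a confidence value per semantic association.

    observations: iterable of (association, same_provenance) pairs taken
    from historical data with complete provenance. An association is
    modeled here as a tuple of relationship names (an assumption).
    Confidence is assumed to be the share of pairs with equal provenance.
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for association, same_provenance in observations:
        totals[association] += 1
        if same_provenance:
            hits[association] += 1
    return {a: hits[a] / totals[a] for a in totals}


def predict_provenance(neighbors, confidence):
    """Predict missing provenance by confidence-weighted voting.

    neighbors: iterable of (association, known_provenance) pairs that
    connect the target item to items whose provenance is known.
    Returns the provenance with the highest total weight, or None
    when there are no neighbors to vote.
    """
    votes = defaultdict(float)
    for association, provenance in neighbors:
        # Associations never seen in the historical data carry no weight.
        votes[provenance] += confidence.get(association, 0.0)
    if not votes:
        return None
    return max(votes, key=votes.get)


# Hypothetical associations (sequences of ontology relationships).
SAME_WELL = ("measuredAt", "partOf")
SAME_FIELD = ("measuredAt", "locatedIn")

history = [
    (SAME_WELL, True), (SAME_WELL, True), (SAME_WELL, False),
    (SAME_FIELD, True), (SAME_FIELD, False),
    (SAME_FIELD, False), (SAME_FIELD, False),
]
confidence = estimate_confidence(history)

neighbors = [
    (SAME_WELL, "simulator_run"),
    (SAME_FIELD, "manual_entry"),
    (SAME_FIELD, "manual_entry"),
]
prediction = predict_provenance(neighbors, confidence)
```

In this toy data, one high-confidence association outweighs two low-confidence ones, which is the behavior that separates weighted voting from a plain majority vote.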
Keywords :
hydrocarbon reservoirs; meta data; ontologies (artificial intelligence); statistical analysis; data quality; domain ontology; missing provenance prediction; provenance collection mechanisms; reservoir engineering; semantic associations; voting algorithm; Data models; Ontologies; Prediction algorithms; Predictive models; Production; Reservoirs; Semantics; Missing Provenance; Provenance; Reservoir Engineering; Semantic Associations
Conference_Title :
Semantic Computing (ICSC), 2011 Fifth IEEE International Conference on
Conference_Location :
Palo Alto, CA
Print_ISBN :
978-1-4577-1648-5
Electronic_ISBN :
978-0-7695-4492-2
DOI :
10.1109/ICSC.2011.42