• DocumentCode
    3148138
  • Title

    OPQL: A First OPM-Level Query Language for Scientific Workflow Provenance

  • Author

    Lim, Chunhyeok ; Lu, Shiyong ; Chebotko, Artem ; Fotouhi, Farshad

  • Author_Institution
    Dept. of Comput. Sci., Wayne State Univ., Detroit, MI, USA
  • fYear
    2011
  • fDate
    4-9 July 2011
  • Firstpage
    136
  • Lastpage
    143
  • Abstract
    Provenance, which is one kind of metadata that captures the derivation history of a data product, including its original data sources, intermediate products, and the steps that were applied to produce it, has become increasingly important in services computing and scientific workflows to validate, interpret, and analyze the results of scientific computing. Most existing systems store captured provenance data in their own provenance stores, which follow proprietary provenance models, and conduct query processing over the physical provenance stores using query languages, such as SQL, SPARQL, and XQuery, which are closely coupled to the underlying provenance storage strategies. In this paper, we present OPQL, an OPM-level provenance query language that is directly defined over the Open Provenance Model (OPM). An OPQL query takes an OPM graph as input and produces an OPM graph as output. Therefore, OPQL queries are not tightly coupled to the underlying provenance storage strategies. Our main contributions are: (i) we design OPQL, including graph patterns and an OPM-based graph algebra for OPQL, that efficiently supports provenance lineage queries, and (ii) we implement OPQL in our OPMPROV system, where the result of OPQL queries is displayed as an OPM graph via the OPMPROV browser. An experimental study is conducted to evaluate the performance and feasibility of OPQL for provenance querying. To the best of our knowledge, OPQL is the first OPM-level query language for scientific workflow provenance.
  • Keywords
    graph theory; meta data; query languages; query processing; scientific information systems; storage management; workflow management software; OPM-based graph algebra; OPM-level query language; OPMProv browser; OPQL query; graph pattern; metadata; physical provenance storage; query processing; scientific workflow provenance; Algebra; Data models; Database languages; Lead; Pattern matching; Process control; XML; OPM; OPM-compliant provenance; OPQL; provenance;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Services Computing (SCC), 2011 IEEE International Conference on
  • Conference_Location
    Washington, DC
  • Print_ISBN
    978-1-4577-0863-3
  • Electronic_ISBN
    978-0-7695-4462-5
  • Type

    conf

  • DOI
    10.1109/SCC.2011.60
  • Filename
    6009254