• DocumentCode
    680232
  • Title
    Automatic capture of provenance data in genome project workflows
  • Author
    Pinheiro, Rodrigo ; Holanda, Maristela ; Araujo, Aleteia P. F. ; Walter, Maria Emilia ; Lifschitz, Sergio
  • Author_Institution
    Comput. Sci. Dept., Univ. of Brasilia, Brasilia, Brazil
  • fYear
    2013
  • fDate
    18-21 Dec. 2013
  • Firstpage
    15
  • Lastpage
    20
  • Abstract
    In the bioinformatics domain, many scientific experiments are designed as computational workflows, which facilitates their implementation and analysis. However, the amount of data generated increases at every phase of each execution, making it difficult to identify the source of the data and the transformations applied to it. It has therefore become necessary to create new tools that automatically verify which resources and parameters were used to generate the results, along with other information needed to validate and publish the experiment. Automatic capture of data provenance has been receiving attention in the scientific community, primarily with regard to bioinformatics projects, because the same workflow is executed several times with different parameters and tool versions. In this paper, we propose using a relational schema to automatically store data provenance, based on the PROV-DM model, for workflows in bioinformatics projects.
  • Keywords
    bioinformatics; data analysis; genomics; PROV-DM model; automatic capture; bioinformatic domain; computational workflows; data generation; data provenance; data transformation; genome project workflows; Bioinformatics; Biological system modeling; Data models; Databases; Genomics; XML; PROV-DM; bioinformatics; data provenance; genome projects; insert; workflow;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Bioinformatics and Biomedicine (BIBM), 2013 IEEE International Conference on
  • Conference_Location
    Shanghai
  • Type
    conf
  • DOI
    10.1109/BIBM.2013.6732621
  • Filename
    6732621