• DocumentCode
    2028395
  • Title

    A workflow-enabled big data analytics software stack for escience

  • Author

    Palazzo, C. ; Mariello, A. ; Fiore, S. ; D'Anca, A. ; Elia, D. ; Williams, D.N. ; Aloisio, G.

  • Author_Institution
    Centro Euro-Mediterraneo sui Cambiamenti Climatici, Lecce, Italy
  • fYear
    2015
  • fDate
    20-24 July 2015
  • Firstpage
    545
  • Lastpage
    552
  • Abstract
    The availability of systems able to process and analyse large amounts of data has boosted scientific advances in several fields. Workflows provide an effective tool to define and manage large sets of processing tasks. In the big data analytics area, the Ophidia project provides a cross-domain big data analytics framework for the analysis of scientific, multi-dimensional datasets. The framework exploits a server-side, declarative, parallel approach for data analysis and mining. It also features a complete workflow management system to support the execution of complex scientific data analysis, schedule task submission, manage operator dependencies and monitor job execution. The workflow management engine allows users to perform a coordinated execution of multiple data analytics operators (both single and massive, i.e. parameter sweep) in an effective manner. For the definition of big data analytics workflows, a JSON schema has been designed and implemented. To aid the definition of workflows, a visual design language consisting of several symbols, named Data Analytics Workflow Modelling Language (DAWML), has also been defined.
  • Keywords
    Big Data; data analysis; data mining; scientific information systems; specification languages; workflow management software; DAWML; JSON schema; Ophidia project; complex scientific data analysis; cross-domain big data analytics framework; data analytics workflow modelling language; data mining; declarative approach; eScience; multidimensional datasets; multiple data analytics operators; parallel approach; scientific advances; server-side approach; visual design language; workflow management system; workflow-enabled big data analytics software stack; Analytical models; Big data; Data analysis; Data models; Engines; Runtime; Servers; big data analytics; scientific workflows; workflow engine; workflow modelling language;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    High Performance Computing & Simulation (HPCS), 2015 International Conference on
  • Conference_Location
    Amsterdam
  • Print_ISBN
    978-1-4673-7812-3
  • Type

    conf

  • DOI
    10.1109/HPCSim.2015.7237088
  • Filename
    7237088