• DocumentCode
    659576
  • Title
    SciFlow: A dataflow-driven model architecture for scientific computing using Hadoop
  • Author
    Xuan, Pengfei; Zheng, Yueli; Sarupria, Sapna; Apon, Amy
  • Author_Institution
    Sch. of Comput., Clemson Univ., Clemson, SC, USA
  • fYear
    2013
  • fDate
    6-9 Oct. 2013
  • Firstpage
    36
  • Lastpage
    44
  • Abstract
    Many computational science applications utilize complex workflow patterns that generate an intricately connected set of output files for subsequent analysis. Some types of applications, such as rare event sampling, additionally require guaranteed completion of all subtasks for analysis, and place significant demands on the workflow management and execution environment. SciFlow is a user interface built over the Hadoop infrastructure that provides a framework to support the complex process and data interactions and guaranteed completion requirements of scientific workflows. It provides an efficient mechanism for building a parallel scientific application with dataflow patterns, and enables the design, deployment, and execution of data-intensive, many-task computing tasks on a Hadoop platform. The design principles of this framework emphasize simplicity, scalability, and fault tolerance. A case study using the forward flux sampling rare event simulation application validates the functionality, reliability, and effectiveness of the framework.
  • Keywords
    Big Data; data flow computing; public domain software; scientific information systems; software architecture; Hadoop infrastructure; SciFlow; data intensive computing tasks; data interactions; dataflow patterns; dataflow-driven model architecture; fault-tolerance; forward flux sampling rare event simulation; many-task computing; parallel scientific application; scientific computing; scientific workflows; user interface; Computational modeling; Computer architecture; Data models; Data processing; Standards; Trajectory; Hadoop; dataflow; dataflow-driven design patterns; forward flux sampling rare events simulation
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Big Data, 2013 IEEE International Conference on
  • Conference_Location
    Silicon Valley, CA
  • Type
    conf
  • DOI
    10.1109/BigData.2013.6691725
  • Filename
    6691725