• DocumentCode
    625598
  • Title

    FlexIO: I/O Middleware for Location-Flexible Scientific Data Analytics

  • Author

    Fang Zheng ; Hongbo Zou ; Eisenhauer, Greg ; Schwan, Karsten ; Wolf, Michael ; Dayal, Jai ; Tuan-Anh Nguyen ; Jianting Cao ; Abbasi, Hasan ; Klasky, Scott ; Podhorszki, Norbert ; Hongfeng Yu

  • Author_Institution
    Georgia Inst. of Technol., Atlanta, GA, USA
  • fYear
    2013
  • fDate
    20-24 May 2013
  • Firstpage
    320
  • Lastpage
    331
  • Abstract
    Increasingly severe I/O bottlenecks on High-End Computing machines are prompting scientists to process simulation output data online while simulations are running and before storing data on disk. There are several options to place data analytics along the I/O path: on compute nodes, on separate nodes dedicated to analytics, or after data is stored on persistent storage. Since different placements have different impact on performance and cost, there is a consequent need for flexibility in the location of data analytics. The FlexIO middleware described in this paper makes it easy for scientists to obtain such flexibility, by offering simple abstractions and diverse data movement methods to couple simulation with analytics. Various placement policies can be built on top of FlexIO to exploit the trade-offs in performing analytics at different levels of the I/O hierarchy. Experimental results demonstrate that FlexIO can support a variety of simulation and analytics workloads at large scale through flexible placement options, efficient data movement, and dynamic deployment of data manipulation functionalities.
  • Keywords
    data analysis; input-output programs; middleware; parallel machines; storage management; FlexIO middleware; I/O bottleneck; I/O hierarchy; I/O path; data storage; disk; diverse data movement method; dynamic data manipulation functionality deployment; flexible data placement policy; high end computing machine; location flexible scientific data analytics; online simulation data processing; performing analytics; persistent storage; Analytical models; Arrays; Computational modeling; Data models; Monitoring; Runtime; Software; Flexibility; I/O; In Situ Data Analytics; Placement
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Parallel & Distributed Processing (IPDPS), 2013 IEEE 27th International Symposium on
  • Conference_Location
    Boston, MA
  • ISSN
    1530-2075
  • Print_ISBN
    978-1-4673-6066-1
  • Type

    conf

  • DOI
    10.1109/IPDPS.2013.46
  • Filename
    6569822