• DocumentCode
    2801976
  • Title

    Using Subfiling to Improve Programming Flexibility and Performance of Parallel Shared-file I/O

  • Author

    Gao, Kui ; Liao, Wei-keng ; Nisar, Arifa ; Choudhary, Alok ; Ross, Robert ; Latham, Robert

  • Author_Institution
    Electr. Eng. & Comput. Sci. Dept., Northwestern Univ., Evanston, IL, USA
  • fYear
    2009
  • fDate
    22-25 Sept. 2009
  • Firstpage
    470
  • Lastpage
    477
  • Abstract
    There are two popular parallel I/O programming styles used by modern scientific computational applications: unique-file and shared-file. Unique-file I/O usually gives satisfactory performance, but its major drawback is that managing a large number of files can overwhelm the task of post-simulation data processing. Shared-file I/O produces fewer files and allows arrays partitioned among processes to be saved in the canonical order. As the number of processors on modern parallel machines increases into thousands and more, the problem size and in turn the global array size also increase proportionally. It is not practical to manage files of size each larger than a few hundreds of GB. Hence, to seek a middle ground between these two I/O styles, we propose a subfiling scheme that divides a large multi-dimensional global array into smaller subarrays, each saved in a smaller file, named subfile. Subfiling is implemented on top of MPI-IO. We also incorporate it into the parallel netCDF library in order to preserve the partitioning information in the netCDF file header, so that the global array can later be reconstructed. In addition, since the subfiling scheme decreases the number of processes sharing a file, it can reduce the overhead of file system's data consistency control. Our experimental results with several I/O benchmarks show that subfiling can provide improved I/O performance.
  • Keywords
    input-output programs; multidimensional global array; parallel IO programming styles; parallel netCDF library; parallel shared file IO; post simulation data processing; programming flexibility; shared file; subfiling scheme; unique file; Application software; Computer science; Concurrent computing; Control systems; File systems; Mathematical programming; Mathematics; Parallel processing; Parallel programming; Software libraries; MPI-IO; Parallel netCDF; subfiling
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Parallel Processing, 2009. ICPP '09. International Conference on
  • Conference_Location
    Vienna
  • ISSN
    0190-3918
  • Print_ISBN
    978-1-4244-4961-3
  • Electronic_ISBN
    0190-3918
  • Type

    conf

  • DOI
    10.1109/ICPP.2009.68
  • Filename
    5362452