Title :
Auto-Tuning of Parallel IO Parameters for HDF5 Applications
Author :
Behzad, Babak ; Huchette, Joseph ; Luu, Huong ; Aydt, Ruth ; Koziol, Quincey ; Prabhat, Mr ; Byna, Surendra ; Chaarawi, Mohamad ; Yao, Yushu
Abstract :
Parallel I/O is an unavoidable part of modern high-performance computing (HPC), but its system-wide dependencies mean it has eluded optimization across platforms and applications. This can introduce bottlenecks in otherwise computationally efficient code, especially as scientific computing becomes increasingly data-driven. Various studies have shown that dramatic improvements are possible when I/O parameters are set appropriately. However, because the HPC I/O stack has multiple layers, each with its own optimization parameters, and because a single test run has a nontrivial execution time, finding the optimal parameter values is a very complex problem. Additionally, optimal parameter sets do not necessarily transfer between use cases, since I/O performance can depend heavily on the individual application, the problem size, and the compute platform being used. Tunable parameters are exposed primarily at three levels in the I/O stack: the system, middleware, and high-level data-organization layers. HPC systems need a parallel file system, such as Lustre, to store data intelligently in a parallelized fashion. Middleware communication layers, such as MPI-IO, support this kind of parallel I/O and offer a variety of optimizations, such as collective buffering. Scientists and application developers often use HDF5, a high-level cross-platform I/O library that offers a hierarchical object-database representation of scientific data.
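To make the size of the tuning problem concrete, the sketch below enumerates a hypothetical parameter space spanning the three layers named in the abstract and estimates the cost of testing every combination. The parameter names, the candidate values, and the five-minute run time are all assumptions chosen for illustration; none of them are taken from the paper.

```python
from itertools import product

# Hypothetical search space: every name and value below is an illustrative
# assumption, not a setting reported in the abstract.
PARAMETER_SPACE = {
    # System layer: parallel file system (Lustre striping)
    "lustre_stripe_count": [4, 8, 16, 32, 64, 128],
    "lustre_stripe_size_mib": [1, 2, 4, 8, 16, 32, 64, 128],
    # Middleware layer: MPI-IO collective buffering
    "mpiio_cb_nodes": [1, 2, 4, 8, 16, 32],
    "mpiio_cb_buffer_size_mib": [1, 2, 4, 8, 16, 32, 64, 128],
    # High-level data-organization layer: HDF5
    "hdf5_alignment_kib": [512, 1024, 2048, 4096],
}


def count_configurations(space):
    """Return the number of distinct parameter combinations in the space."""
    total = 1
    for values in space.values():
        total *= len(values)
    return total


def iter_configurations(space):
    """Yield each combination as a dict mapping parameter name to value."""
    names = list(space)
    for combo in product(*(space[name] for name in names)):
        yield dict(zip(names, combo))


def exhaustive_search_hours(space, minutes_per_run):
    """Estimate wall-clock hours needed to run one test per combination."""
    return count_configurations(space) * minutes_per_run / 60.0


if __name__ == "__main__":
    print(count_configurations(PARAMETER_SPACE), "configurations")
    # Assumed run time of 5 minutes per test; real runs may differ widely.
    print(exhaustive_search_hours(PARAMETER_SPACE, minutes_per_run=5.0), "hours")
```

Even with only a handful of candidate values per parameter, the combinations multiply quickly across layers, which is one way to see why exhaustive testing is impractical when each run has a nontrivial execution time.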
Keywords :
file organisation; input-output programs; middleware; natural sciences computing; parallel processing; HDF5 applications; HPC I/O stack; HPC systems; I/O performance tuning; Lustre; MPI-IO; collective buffering; computationally efficient code; data driven scientific computing; execution time; hierarchical object-database representation; high level data organization layers; high-level cross-platform I/O library; middleware communication layers; modern high-performance computing; optimal parameter values; optimal sets; optimization across platforms; optimization parameters; parallel IO parameter auto-tuning; parallel file system; system wide dependencies; tunable parameters;
Conference_Title :
High Performance Computing, Networking, Storage and Analysis (SCC), 2012 SC Companion
Conference_Location :
Salt Lake City, UT
Print_ISBN :
978-1-4673-6218-4
DOI :
10.1109/SC.Companion.2012.236