Title :
Parallel Post-Processing with MPI-Bash
Author_Institution :
Computer, Computational, and Statistical Sciences Division, Los Alamos National Laboratory, Los Alamos, NM, USA
Abstract :
Parallel scientific applications running on massively parallel supercomputers commonly produce large numbers of large data files. While parallel filesystems improve the performance of file generation, post-processing activities such as archiving and compressing the data or performing routine format transformations are often run sequentially (and therefore slowly), squandering the supercomputer's vast performance. Consequently, data that take hours to generate may take days to post-process. Because post-processing often consists of running a relatively small set of shell commands on a relatively large number of files, and because most parallel-application developers are comfortable with MPI, we propose turning the shell itself into an MPI program and exposing common MPI functions directly to user-written shell scripts. Our implementation, MPI-Bash, has been used to date to speed up the compression, archiving, and transfer of large files but can conceivably be applied to numerous other purposes.
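The idea of running a small set of shell commands over many files, split across MPI ranks, can be illustrated with a minimal sketch. This is not code from the paper: the function name files_for_rank and the variables RANK and NRANKS are hypothetical, and the sketch assumes that a script running under an MPI-enabled shell would obtain its rank and the total rank count from the MPI rank and size queries that the shell exposes. Here they are plain arguments, so the partitioning logic runs in ordinary Bash.

```shell
#!/usr/bin/env bash
# Sketch only: round-robin assignment of files to ranks.
# RANK and NRANKS are hypothetical stand-ins for the values an
# MPI-enabled shell would report for the current process.

# Print, one per line, the files that belong to rank $1 out of $2 ranks.
# All remaining arguments are the candidate file names.
files_for_rank() {
    local rank=$1 nranks=$2
    shift 2
    local i=0 f
    for f in "$@"; do
        # File number i goes to rank (i mod nranks).
        if (( i % nranks == rank )); then
            printf '%s\n' "$f"
        fi
        i=$(( i + 1 ))
    done
}
```

Each rank could then compress only its own share, for example with files_for_rank "$RANK" "$NRANKS" data/*.dat piped into a loop that calls gzip on every name it reads. Round-robin assignment is the simplest choice; when file sizes vary widely, a size-aware or dynamic distribution would likely balance the work better.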
Keywords :
data compression; file organisation; message passing; parallel processing; MPI functions; MPI program; MPI-Bash; file generation; large data files; large files archiving; large files compression; large files transfer; message passing interface; parallel filesystems; parallel post-processing activities; parallel scientific applications; parallel supercomputers; shell commands; user-written shell scripts; Conferences; MATLAB; Mathematical model; Parallel processing; Registers; Standards; Synchronization
Conference_Title :
HPC User Support Tools (HUST), 2014 First International Workshop on
DOI :
10.1109/HUST.2014.9