DocumentCode
625602
Title
Efficient and Scalable Retrieval Techniques for Global File Properties
Author
Ahn, Dong H. ; Brim, Michael J. ; de Supinski, Bronis R. ; Gamblin, Todd ; Lee, Gregory L. ; LeGendre, Matthew P. ; Miller, Barton P. ; Schulz, Markus
Author_Institution
Comput. Directorate, Lawrence Livermore Nat. Lab., Livermore, CA, USA
fYear
2013
fDate
20-24 May 2013
Firstpage
369
Lastpage
380
Abstract
Large-scale systems typically mount many different file systems with distinct performance characteristics and capacities. Applications must efficiently use this storage to realize their full performance potential. Users must take into account potential file replication throughout the storage hierarchy as well as contention in lower levels of the I/O system, and must consider communicating the results of file I/O between application processes to reduce file system accesses. Addressing these issues and optimizing file accesses requires detailed runtime knowledge of file system performance characteristics and the location(s) of files on them. In this paper, we propose Fast Global File Status (FGFS), a scalable mechanism to retrieve information about a file, such as its degree of distribution or replication and its consistency. We use a novel node-local technique that turns expensive, non-scalable file system calls into simple string comparison operations. FGFS raises the namespace of a locally defined file path to a global namespace with few or no file system calls to obtain global file properties efficiently. Our evaluation on a large multi-physics application shows that most FGFS file status queries on its executable and 848 shared library files complete in 272 milliseconds or faster at 32,768 MPI processes. Even the most expensive operation, which checks global file consistency, completes in under 7 seconds at this scale, an improvement of several orders of magnitude over the traditional checksum technique.
Keywords
application program interfaces; file organisation; large-scale systems; message passing; query processing; shared memory systems; FGFS; FGFS file status queries; MPI processes; executable files; file I/O system; file consistency; file distribution; file information retrieval; file replication; file system access; file system location; file system performance characteristics; global file properties; global namespace; locally-defined file path; multiphysics application; node-local technique; scalable fast global file status mechanism; shared library files; storage hierarchy; Complexity theory; Libraries; Load modeling; Scalability; Servers; Software; Storms
fLanguage
English
Publisher
IEEE
Conference_Titel
Parallel & Distributed Processing (IPDPS), 2013 IEEE 27th International Symposium on
Conference_Location
Boston, MA
ISSN
1530-2075
Print_ISBN
978-1-4673-6066-1
Type
conf
DOI
10.1109/IPDPS.2013.49
Filename
6569826