• DocumentCode
    625602
  • Title

    Efficient and Scalable Retrieval Techniques for Global File Properties

  • Author

    Ahn, Dong H. ; Brim, Michael J. ; de Supinski, Bronis R. ; Gamblin, Todd ; Lee, Gregory L. ; LeGendre, Matthew P. ; Miller, Barton P. ; Schulz, Markus

  • Author_Institution
    Comput. Directorate, Lawrence Livermore Nat. Lab., Livermore, CA, USA
  • fYear
    2013
  • fDate
    20-24 May 2013
  • Firstpage
    369
  • Lastpage
    380
  • Abstract
    Large-scale systems typically mount many different file systems with distinct performance characteristics and capacities. Applications must efficiently use this storage in order to realize their full performance potential. Users must take into account potential file replication throughout the storage hierarchy as well as contention in lower levels of the I/O system, and must consider communicating the results of file I/O between application processes to reduce file system accesses. Addressing these issues and optimizing file accesses requires detailed runtime knowledge of file system performance characteristics and the location(s) of files on them. In this paper, we propose Fast Global File Status (FGFS), a scalable mechanism to retrieve information about a file, such as its degree of distribution or replication and its consistency. We use a novel node-local technique that turns expensive, non-scalable file system calls into simple string comparison operations. FGFS raises the namespace of a locally-defined file path to a global namespace with few or no file system calls to obtain global file properties efficiently. Our evaluation on a large multi-physics application shows that most FGFS file status queries on its executable and 848 shared library files complete in 272 milliseconds or faster at 32,768 MPI processes. Even the most expensive operation, which checks global file consistency, completes in under 7 seconds at this scale, an improvement of several orders of magnitude over the traditional checksum technique.
  • Keywords
    application program interfaces; file organisation; large-scale systems; message passing; query processing; shared memory systems; FGFS; FGFS file status queries; MPI processes; executable files; file I/O system; file consistency; file distribution; file information retrieval; file replication; file system access; file system location; file system performance characteristics; global file properties; global namespace; large-scale systems; locally-defined file path; multiphysics application; node-local technique; scalable fast global file status mechanism; shared library files; storage hierarchy; Complexity theory; Libraries; Load modeling; Scalability; Servers; Software; Storms;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Title
    Parallel & Distributed Processing (IPDPS), 2013 IEEE 27th International Symposium on
  • Conference_Location
    Boston, MA
  • ISSN
    1530-2075
  • Print_ISBN
    978-1-4673-6066-1
  • Type

    conf

  • DOI
    10.1109/IPDPS.2013.49
  • Filename
    6569826
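
  • Illustrative sketch (not part of the original record)
    The abstract describes raising a locally defined file path to a global namespace so that global file properties can be decided by string comparison instead of repeated file system calls. The Python sketch below shows one way that idea might look; it is not the authors' implementation, which this record does not describe. The names `Mount`, `find_mount`, `global_name`, and `is_shared`, the `REMOTE_FS_TYPES` set, and the `scheme://source/relative-path` naming format are all assumptions made for illustration, and the mount tables are hard-coded rather than read from the operating system.

```python
import posixpath
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Mount:
    """One mount-table entry (hypothetical structure for this sketch)."""
    source: str       # e.g. "nfs1:/export/home" or "/dev/sda1"
    mount_point: str  # e.g. "/home"
    fs_type: str      # e.g. "nfs", "ext4"


# Assumption: file system types treated as remote (shared) in this sketch.
REMOTE_FS_TYPES = {"nfs", "nfs4", "lustre", "gpfs", "panfs", "cifs"}


def _norm_mount_point(mount_point: str) -> str:
    # Strip trailing slashes but keep the root mount as "/".
    return mount_point.rstrip("/") or "/"


def find_mount(path: str, mounts: List[Mount]) -> Optional[Mount]:
    """Return the mount with the longest mount point that prefixes path."""
    path = posixpath.normpath(path)
    best: Optional[Mount] = None
    best_len = -1
    for m in mounts:
        mp = _norm_mount_point(m.mount_point)
        if mp == "/":
            matches = path.startswith("/")
        else:
            # Match whole path components only: "/home" must not
            # match "/homework".
            matches = path == mp or path.startswith(mp + "/")
        if matches and len(mp) > best_len:
            best, best_len = m, len(mp)
    return best


def global_name(path: str, mounts: List[Mount], hostname: str) -> str:
    """Map a node-local path to a globally comparable name."""
    path = posixpath.normpath(path)
    m = find_mount(path, mounts)
    if m is None or m.fs_type not in REMOTE_FS_TYPES:
        # Node-local storage: the name is only unique per host.
        return f"local://{hostname}{path}"
    rel = posixpath.relpath(path, _norm_mount_point(m.mount_point))
    rel = "" if rel == "." else rel
    return f"{m.fs_type}://{m.source}/{rel}"


def is_shared(path: str, mounts_by_host: Dict[str, List[Mount]]) -> bool:
    """True if every host resolves path to the same global name."""
    names = {global_name(path, mounts, host)
             for host, mounts in mounts_by_host.items()}
    return len(names) == 1
```

    In this sketch each node would compute `global_name` from its own mount table, and deciding whether a file is shared or replicated reduces to comparing the resulting strings across nodes. In a parallel job that comparison would typically be done with a collective operation, which is omitted here.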