• DocumentCode
    74467
  • Title
    SANE: Semantic-Aware Namespace in Ultra-Large-Scale File Systems
  • Author
    Yu Hua; Hong Jiang; Yifeng Zhu; Dan Feng; Lei Xu
  • Author_Institution
    Sch. of Comput. Sci. & Technol., Huazhong Univ. of Sci. & Technol., Wuhan, China
  • Volume
    25
  • Issue
    5
  • fYear
    2014
  • fDate
    May 2014
  • Firstpage
    1328
  • Lastpage
    1338
  • Abstract
    The explosive growth in data volume and complexity imposes great challenges for file systems. To address these challenges, an innovative namespace management scheme is in desperate need to provide both the ease and efficiency of data access. In almost all today's file systems, the namespace management is based on hierarchical directory trees. This tree-based namespace scheme is prone to severe performance bottlenecks and often fails to provide real-time response to complex data lookups. This paper proposes a Semantic-Aware Namespace scheme, called SANE, which provides dynamic and adaptive namespace management for ultra-large storage systems with billions of files. SANE introduces a new naming methodology based on the notion of semantic-aware per-file namespace, which exploits semantic correlations among files, to dynamically aggregate correlated files into small, flat but readily manageable groups to achieve fast and accurate lookups. SANE is implemented as a middleware in conventional file systems and works orthogonally with hierarchical directory trees. The semantic correlations and file groups identified in SANE can also be used to facilitate file prefetching and data de-duplication, among other system-level optimizations. Extensive trace-driven experiments on our prototype implementation validate the efficacy and efficiency of SANE.
  • Keywords
    middleware; storage management; tree data structures; SANE; complex data lookups; data access; data complexity; data de-duplication; data volume; dynamic file aggregation; dynamic-adaptive namespace management; file groups; file prefetching; hierarchical directory trees; semantic correlations; semantic-aware namespace scheme; semantic-aware per-file namespace; system-level optimizations; tree-based namespace scheme; ultralarge storage systems; ultralarge-scale file systems; Complexity theory; Correlation; File systems; Indexes; Prefetching; Semantics; System performance; namespace management; semantic awareness; storage systems
  • fLanguage
    English
  • Journal_Title
    Parallel and Distributed Systems, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1045-9219
  • Type
    jour
  • DOI
    10.1109/TPDS.2013.140
  • Filename
    6519233