Title :
SANE: Semantic-Aware Namespace in Ultra-Large-Scale File Systems
Author :
Yu Hua ; Hong Jiang ; Yifeng Zhu ; Dan Feng ; Lei Xu
Author_Institution :
Sch. of Comput. Sci. & Technol., Huazhong Univ. of Sci. & Technol., Wuhan, China
Abstract :
The explosive growth in data volume and complexity imposes great challenges for file systems. To address these challenges, an innovative namespace management scheme is urgently needed to provide both ease and efficiency of data access. In almost all of today's file systems, namespace management is based on hierarchical directory trees. This tree-based namespace scheme is prone to severe performance bottlenecks and often fails to provide real-time response to complex data lookups. This paper proposes a Semantic-Aware Namespace scheme, called SANE, which provides dynamic and adaptive namespace management for ultra-large storage systems with billions of files. SANE introduces a new naming methodology based on the notion of a semantic-aware per-file namespace, which exploits semantic correlations among files to dynamically aggregate correlated files into small, flat, but readily manageable groups that support fast and accurate lookups. SANE is implemented as middleware in conventional file systems and works orthogonally with hierarchical directory trees. The semantic correlations and file groups identified in SANE can also be used to facilitate file prefetching and data de-duplication, among other system-level optimizations. Extensive trace-driven experiments on our prototype implementation validate the efficacy and efficiency of SANE.
Keywords :
middleware; storage management; tree data structures; SANE; complex data lookups; data access; data complexity; data de-duplication; data volume; dynamic file aggregation; dynamic-adaptive namespace management; file groups; file prefetching; hierarchical directory trees; semantic correlations; semantic-aware namespace scheme; semantic-aware per-file namespace; system-level optimizations; tree-based namespace scheme; ultralarge storage systems; ultralarge-scale file systems; Complexity theory; Correlation; File systems; Indexes; Prefetching; Semantics; System performance; namespace management; semantic awareness; storage systems
Journal_Title :
Parallel and Distributed Systems, IEEE Transactions on
DOI :
10.1109/TPDS.2013.140