• DocumentCode
    168640
  • Title

    HyCache+: Towards Scalable High-Performance Caching Middleware for Parallel File Systems

  • Author

    Dongfang Zhao ; Kan Qiao ; Ioan Raicu

  • Author_Institution
    Illinois Inst. of Technol., Chicago, IL, USA
  • fYear
    2014
  • fDate
    26-29 May 2014
  • Firstpage
    267
  • Lastpage
    276
  • Abstract
    The ever-growing gap between computation and I/O is one of the fundamental challenges for future computing systems. This computation-I/O gap is even larger for modern large-scale high-performance systems because of their state-of-the-art yet decades-old architecture: the compute and storage resources form two cliques that are interconnected by a shared networking infrastructure. This paper presents a distributed storage middleware, called HyCache+, that runs directly on the compute nodes and allows I/O to effectively leverage the high bisection bandwidth of the high-speed interconnect of massively parallel high-end computing systems. HyCache+ provides end users with a POSIX interface at memory-class I/O throughput and latency, and transparently swaps cached data with the existing slow but high-capacity network-attached storage. HyCache+ has the potential to achieve both high performance and low-cost, large capacity, the best of both worlds. To further improve caching performance from the perspective of the global storage system, we propose a 2-phase mechanism to cache the hot data for parallel applications, called 2-Layer Scheduling (2LS), which minimizes the file size to be transferred between compute nodes and heuristically replaces files in the cache. We deploy HyCache+ on the IBM Blue Gene/P supercomputer and observe I/O throughput two orders of magnitude higher than that of the default GPFS parallel file system. Furthermore, the proposed heuristic caching approach shows a 29X speedup over the traditional LRU algorithm.
  • Keywords
    Unix; cache storage; distributed databases; input-output programs; middleware; parallel machines; processor scheduling; storage management; 2-phase mechanism; 2LS; 2-Layer Scheduling; GPFS parallel file system; HyCache+; IBM Blue Gene/P supercomputer; LRU algorithm; POSIX interface; bisection bandwidth; cached data; caching performance; computation-I/O gap; distributed storage middleware; global storage system; heuristic caching approach; high-capacity networked attached storage; high-performance caching middleware; high-performance system; high-speed interconnect; memory-class I/O throughput; parallel file systems; parallel high-end computing systems; shared networking infrastructure; storage resource; Bandwidth; Distributed databases; Encoding; Middleware; Protocols; Servers; Throughput; distributed caching; heterogeneous storage; parallel and distributed file systems
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Cluster, Cloud and Grid Computing (CCGrid), 2014 14th IEEE/ACM International Symposium on
  • Conference_Location
    Chicago, IL
  • Type

    conf

  • DOI
    10.1109/CCGrid.2014.11
  • Filename
    6846462