• DocumentCode
    2963414
  • Title
    Building Highly Available Cluster File System Based on Replication
  • Author
    Cao, Liang ; Wang, Yu ; Xiong, Jin
  • Author_Institution
    Inst. of Comput. Technol., Chinese Acad. of Sci., Beijing, China
  • fYear
    2009
  • fDate
    8-11 Dec. 2009
  • Firstpage
    94
  • Lastpage
    101
  • Abstract
    To improve cost-effectiveness, current large-scale storage systems are typically built from thousands of individual components. As systems scale up, the probability that multiple components fail increases, and in large-scale storage systems failures are the norm rather than the exception. Building file systems that provide both high throughput and highly available service under such circumstances is a major challenge. We have designed and implemented HA-DCFS3, a highly available cluster file system prototype. It uses a scalable replication algorithm called the asynchronous primary copy protocol (APCP). Unlike the traditional primary copy protocol, which must synchronize updates to all replicas, APCP introduces an asynchronous approach in which a write operation may be synchronized to only a subset of replicas. This flexible approach greatly improves write performance. Furthermore, HA-DCFS3 introduces a fine-grained failure detection mechanism called "data path detection", which is integrated into the fault-tolerant framework based on data replication. Hence, HA-DCFS3 can provide continuous service even when component failures occur. Finally, HA-DCFS3 adopts a two-level data recovery strategy that handles transient failures with reintegration and persistent failures with re-replication, reducing the cost of data repair. Our performance results show that HA-DCFS3 can deliver high and scalable aggregate performance and provide highly available service at very low cost.
  • Keywords
    fault tolerant computing; storage management; HA-DCFS3; asynchronous primary copy protocol; cluster file system; cost-effectiveness; data path detection; fault-tolerant framework; fine-grained failure detection; large-scale storage system; scalable replication algorithm; Aggregates; Clustering algorithms; Costs; Fault detection; Fault tolerance; File systems; Large-scale systems; Protocols; Prototypes; Throughput; data availability; data replication; fault-tolerance; performance; primary copy;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Parallel and Distributed Computing, Applications and Technologies, 2009 International Conference on
  • Conference_Location
    Higashi Hiroshima
  • Print_ISBN
    978-0-7695-3914-0
  • Type
    conf
  • DOI
    10.1109/PDCAT.2009.14
  • Filename
    5372817