DocumentCode
2963414
Title
Building Highly Available Cluster File System Based on Replication
Author
Cao, Liang ; Wang, Yu ; Xiong, Jin
Author_Institution
Inst. of Comput. Technol., Chinese Acad. of Sci., Beijing, China
fYear
2009
fDate
8-11 Dec. 2009
Firstpage
94
Lastpage
101
Abstract
In order to gain better cost-effectiveness, current large-scale storage systems are typically built from thousands of individual components. As systems scale up, the probability that multiple components fail increases, and for large-scale storage systems failures are the norm rather than the exception. Building file systems that provide both high throughput and highly available service under such circumstances is a big challenge. We have designed and implemented HA-DCFS3, a highly available cluster file system prototype. It uses a scalable replication algorithm called the asynchronous primary copy protocol (APCP). Unlike the traditional primary copy protocol, which must synchronize updates to all replicas, APCP introduces an asynchronous approach in which a write operation is permitted to be synchronized to a subset of replicas. This flexible approach greatly improves write performance. Furthermore, HA-DCFS3 introduces a fine-grained failure detection mechanism called "data path detection", which is integrated into the fault-tolerant framework based on data replication. Hence, HA-DCFS3 can provide continuous service even when component failures occur. Finally, HA-DCFS3 adopts a two-level data recovery strategy that handles transient failures with reintegration and persistent failures with re-replication, reducing the cost of data repair. Our performance results show that HA-DCFS3 delivers high and scalable aggregate performance and provides highly available service at very low cost.
Keywords
fault tolerant computing; storage management; HA-DCFS3; asynchronous primary copy protocol; cluster file system; cost-effectiveness; data path detection; fault-tolerant framework; fine-grained failure detection; large-scale storage system; scalable replication algorithm; Aggregates; Clustering algorithms; Costs; Fault detection; Fault tolerance; File systems; Large-scale systems; Protocols; Prototypes; Throughput; data availability; data replication; fault-tolerance; performance; primary copy;
fLanguage
English
Publisher
IEEE
Conference_Titel
Parallel and Distributed Computing, Applications and Technologies, 2009 International Conference on
Conference_Location
Higashi Hiroshima
Print_ISBN
978-0-7695-3914-0
Type
conf
DOI
10.1109/PDCAT.2009.14
Filename
5372817