Title :
Performance and Availability Tradeoffs in Replicated File Systems
Author :
Zhang, Jiaying ; Honeyman, Peter
Author_Institution :
Google, Inc., Santa Monica, CA
Abstract :
Replication is a key technique for improving fault tolerance but can introduce considerable performance overhead under some circumstances. To explore the tradeoff between performance and failure resilience, we develop a calculus that takes into consideration the I/O characteristics of applications and failure behavior of distributed storage nodes. With the developed evaluation model, we then prescribe a file system replication strategy that maximizes the utilization of computational resources for long-running and compute-intensive grid applications.
Keywords :
fault tolerant computing; grid computing; distributed storage nodes; failure resilience; fault tolerance; performance overhead; replicated file systems; Availability; Calculus; Computer networks; Delay; Fault tolerance; File servers; File systems; Grid computing; Network servers; Resilience; Grid; Replication; availability; performance; tradeoff
Conference_Title :
Cluster Computing and the Grid, 2008. CCGRID '08. 8th IEEE International Symposium on
Conference_Location :
Lyon
Print_ISBN :
978-0-7695-3156-4
Electronic_ISBN :
978-0-7695-3156-4
DOI :
10.1109/CCGRID.2008.80