DocumentCode :
3544019
Title :
Analyzing Long-Term Access Locality to Find Ways to Improve Distributed Storage Systems
Author :
Miranda, Alberto ; Cortes, Toni
Author_Institution :
Barcelona Supercomputing Center, Barcelona, Spain
fYear :
2012
fDate :
15-17 Feb. 2012
Firstpage :
544
Lastpage :
553
Abstract :
An efficient design for a distributed file system originates from a deep understanding of common access patterns and user behavior, which is obtained through careful analysis of traces and snapshots. In this paper we analyze traces for eight distributed file systems that represent a mix of workloads taken from educational, research and commercial environments. We focused on characterizing block access patterns, the amount of block sharing and the working set size over long periods of time, and we tried to find behaviors common to all workloads that can be generalized to other storage systems. We found that most environments shared large numbers of blocks over time, and that block sharing was significantly affected by repetitive human behavior. We also found that block lifetimes tended to be short, but that a significant number of blocks had long lifetimes and were accessed over many consecutive days. Lastly, we determined that most daily accesses were made to a small set of blocks. We strongly believe that these findings can be used to improve long-term caching policies as well as data placement algorithms, thus increasing the performance of distributed storage systems.
Keywords :
cache storage; distributed databases; information retrieval; block access pattern characterization; block lifetime; block sharing; commercial environment; common access pattern; data placement algorithm; distributed file system design; distributed storage system; educational environment; long-term access locality analysis; long-term caching policy; repetitive human behavior; research environment; user behavior; Aggregates; Animation; Distributed databases; Humans; Rendering (computer graphics); Satellite broadcasting; Servers;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Parallel, Distributed and Network-Based Processing (PDP), 2012 20th Euromicro International Conference on
Conference_Location :
Garching
ISSN :
1066-6192
Print_ISBN :
978-1-4673-0226-5
Type :
conf
DOI :
10.1109/PDP.2012.15
Filename :
6169634