Title :
DRepl: Optimizing access to application data for analysis and visualization
Author :
Ionkov, Latchesar ; Lang, Michael ; Maltzahn, Carlos
Abstract :
Until recently, most scientific applications produced data that was saved, then analyzed and visualized at a later time. In recent years, with the large increase in the amount of data and computational power available, there is demand for applications to support data access in situ, or close to the simulation, to provide application steering, analytics, and visualization. The data access patterns required for these activities usually differ from the data layout produced by the application. In most large HPC clusters, scientific data is stored in parallel file systems instead of locally on the cluster nodes. To increase reliability, the data is replicated using standard RAID schemes. Parallel file server nodes usually have more processing power than they need, so it is feasible to offload some of the data-intensive processing to them. DRepl replaces the standard methods of data replication with replicas that have different layouts, optimized for the most commonly used access patterns. Replicas can be complete (i.e., any other replica can be reconstructed from them) or incomplete. DRepl consists of a language to describe the dataset and the necessary data layouts, and tools to create a user-space file server that provides the data and keeps it consistent and up to date in all optimized layouts. DRepl decouples the data producers and consumers, and the data layouts they use, from the way the data is stored on the storage system. DRepl has shown up to a 2x improvement in cumulative performance when data is accessed using optimized replicas.
Keywords :
RAID; file servers; parallel processing; storage allocation; DRepl; HPC clusters; RAID schemes; application steering; cluster nodes; data access patterns; data layout; data replication; data storage; parallel file server nodes; parallel file systems; user-space file server; Arrays; Data models; Data visualization; Engines; Layout; Reactive power; Servers; DISC; exascale; fault tolerance
Conference_Title :
Mass Storage Systems and Technologies (MSST), 2013 IEEE 29th Symposium on
Conference_Location :
Long Beach, CA
Print_ISBN :
978-1-4799-0217-0
DOI :
10.1109/MSST.2013.6558439