DocumentCode :
569048
Title :
SMART-IO: SysteM-AwaRe Two-Level Data Organization for Efficient Scientific Analytics
Author :
Tian, Yuan ; Klasky, Scott ; Yu, Weikuan ; Abbasi, Hasan ; Wang, Bin ; Podhorszki, Norbert ; Grout, Ray ; Wolf, Matthew
fYear :
2012
fDate :
7-9 Aug. 2012
Firstpage :
181
Lastpage :
188
Abstract :
Current I/O techniques have pushed write performance close to the system peak, but they usually overlook the read side of the problem. With the mounting needs of scientific discovery, it is important to provide good read performance for many common access patterns. Such a demand requires an organization scheme that can effectively utilize the underlying storage system. However, the mismatch between the conventional data layout on disk and common scientific access patterns leads to significant performance degradation when a subset of the data is accessed. To this end, we design a system-aware Optimized Chunking model, which aims to find an optimized organization that strikes a good balance between data transfer efficiency and processing overhead. To enable such a model for scientific applications, we propose SMART-IO, a two-level data organization framework that can organize the blocks of multidimensional data efficiently. This scheme can adapt data layouts based on data characteristics and the underlying storage system, and enable efficient scientific analytics. Our experimental results demonstrate that SMART-IO can significantly improve read performance for challenging access patterns and speed up data analytics. For the mission-critical combustion simulation code S3D, SMART-IO achieves up to a 72-fold speedup for planar reads of a 3-D variable compared to the logically contiguous data layout.
Keywords :
combustion; data analysis; middleware; natural sciences computing; storage management; ADIOS; ADaptable I/O System; I/O technique; S3D mission critical combustion simulation code; SMART-IO; SysteM-AwaRe Two-Level Data Organization; data analytics; data characteristics; data subset access; data transfer efficiency; disk data layout; logically contiguous data layout; middleware; multidimensional data block organization; performance degradation; processing overhead; read performance; scientific access pattern; scientific analytics; scientific application; scientific discovery; storage system; system-aware optimized chunking model; two-level data organization framework; write performance; Adaptation models; Data models; Equations; Filling; Layout; Mathematical model; Organizations; ADIOS; Data Organization; Parallel I/O; S3D; Smart-IO;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Modeling, Analysis & Simulation of Computer and Telecommunication Systems (MASCOTS), 2012 IEEE 20th International Symposium on
Conference_Location :
Washington, DC
ISSN :
1526-7539
Print_ISBN :
978-1-4673-2453-3
Type :
conf
DOI :
10.1109/MASCOTS.2012.30
Filename :
6298178