DocumentCode :
1916666
Title :
Light-Weight Data Management Solutions for Visualization and Dissemination of Massive Scientific Datasets - Position Paper
Author :
Agrawal, Gagan ; Yu Su
Author_Institution :
Dept. of Comput. Sci. & Eng., Ohio State Univ., Columbus, OH, USA
fYear :
2012
fDate :
10-16 Nov. 2012
Firstpage :
1296
Lastpage :
1300
Abstract :
Many of the `big-data´ challenges today are arising from increasing computing ability, as data collected from simulations has become extremely valuable for a variety of scientific endeavors. With growing computational capabilities of parallel machines, scientific simulations are being performed at finer spatial and temporal scales, leading to a data explosion. As a specific example, the Global Cloud-Resolving Model (GCRM) currently has a grid-cell size of 4 km, and already produces 1 petabyte of data for a 10 day simulation. Future plans include simulations with a grid-cell size of 1 km, which will increase the data generation 64 folds. Finer granularity of simulation data offers both an opportunity and a challenge. On one hand, it can allow understanding of underlying phenomenon and features in a way that would not be possible with coarser granularity. On the other hand, larger datasets are extremely difficult to store, manage, disseminate, analyze, and visualize. Neither the memory capacity of parallel machines, memory access speeds, nor disk bandwidths are increasing at the same rate as computing power, contributing to the difficulty in storing, managing, and analyzing these datasets. Simulation data is often disseminated widely, through portals like the Earth System Grid (ESG), and downloaded by researchers all over the world. Such dissemination efforts are hampered by dataset size growth, as wide area data transfer bandwidths are growing at a much slower pace. Finally, while visualizing datasets, human perception is inherently limited.
Keywords :
cloud computing; data analysis; data visualisation; Earth system grid; GCRM; data analysis; data dissemination; data explosion; data generation; data management; data management solution; data storage; data visualization; global cloud-resolving model; massive scientific dataset; parallel machine; simulation data granularity;
fLanguage :
English
Publisher :
ieee
Conference_Titel :
High Performance Computing, Networking, Storage and Analysis (SCC), 2012 SC Companion:
Conference_Location :
Salt Lake City, UT
Print_ISBN :
978-1-4673-6218-4
Type :
conf
DOI :
10.1109/SC.Companion.2012.157
Filename :
6495939
Link To Document :
بازگشت