Title :
A scalable messaging system for accelerating discovery from large scale scientific simulations
Author :
Tong Jin ; Fan Zhang ; Parashar, Manish ; Klasky, Scott ; Podhorszki, Norbert ; Abbasi, Hasan
Author_Institution :
NSF Center for Cloud & Autonomic Comput., Rutgers Univ., Piscataway, NJ, USA
Abstract :
Emerging scientific and engineering simulations running at scale on leadership-class High End Computing (HEC) environments are producing large volumes of data, which must be transported and analyzed before any insights can result from these simulations. The complexity and cost (in terms of time and energy) associated with managing and analyzing this data have become significant challenges and are limiting the impact of these simulations. Recently, data-staging approaches along with in-situ and in-transit analytics have been proposed to address these challenges by offloading I/O and/or moving data processing closer to the data. However, scientists continue to be overwhelmed by the large data volumes and data rates. In this paper we address this latter challenge. Specifically, we propose a highly scalable and low-overhead associative messaging framework that runs on the data staging resources within the HEC platform and builds on staging-based online in-situ/in-transit analytics to provide publish/subscribe/notification-type messaging patterns to the scientist. Rather than having to ingest and inspect the data volumes, this messaging system allows scientists to (1) dynamically subscribe to data events of interest, e.g., a data value, a complex function, or a simple reduction (max()/min()/avg()) of the data values in a certain region of the application domain becoming greater or less than a threshold value, or certain spatial/temporal data features or data patterns being detected; (2) define customized in-situ/in-transit actions that are triggered based on the events, such as data visualization or transformation; and (3) get notified when these events occur. The key contribution of this paper is a design and implementation that can support such a messaging abstraction at scale on HEC systems with minimal overheads.
We have implemented and deployed the messaging system on the Jaguar Cray XK6 machines at Oak Ridge National Laboratory and the Lonestar system at the Texas Advanced Computing Center (TACC), and we present the experimental performance evaluation using these HEC platforms in the paper.
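The subscribe/action/notify pattern described in the abstract can be illustrated with a minimal, single-process sketch. This is not the authors' implementation or API: the names `Subscription`, `EventBroker`, `subscribe`, and `publish` are hypothetical, the application domain is assumed to be a one-dimensional list of floats, and no staging, distribution, or scaling is modeled. It only shows how a reduction over a region, compared against a threshold, could trigger a user-defined action and a notification.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical names throughout; this is an illustration, not the paper's API.
REDUCTIONS: Dict[str, Callable[[Sequence[float]], float]] = {
    "max": max,
    "min": min,
    "avg": lambda values: sum(values) / len(values),
}


@dataclass
class Subscription:
    region: Tuple[int, int]               # half-open index range [start, stop)
    reduction: str                        # key into REDUCTIONS
    comparison: str                       # "greater" or "less"
    threshold: float
    action: Callable[[int, float], None]  # called with (time step, reduced value)


class EventBroker:
    """Evaluates every subscription against each published time step."""

    def __init__(self) -> None:
        self.subscriptions: List[Subscription] = []
        self.notifications: List[Tuple[int, str, float]] = []

    def subscribe(self, sub: Subscription) -> None:
        self.subscriptions.append(sub)

    def publish(self, step: int, data: Sequence[float]) -> None:
        for sub in self.subscriptions:
            start, stop = sub.region
            values = data[start:stop]
            if len(values) == 0:
                continue  # region lies outside the published data
            reduced = REDUCTIONS[sub.reduction](values)
            if sub.comparison == "greater":
                fired = reduced > sub.threshold
            else:
                fired = reduced < sub.threshold
            if fired:
                sub.action(step, reduced)  # user-defined action
                self.notifications.append((step, sub.reduction, reduced))


# Usage: record the steps at which max() over indices 2..4 exceeds 10.0.
triggered: List[Tuple[int, float]] = []
broker = EventBroker()
broker.subscribe(
    Subscription((2, 5), "max", "greater", 10.0,
                 lambda step, value: triggered.append((step, value)))
)
broker.publish(0, [1.0, 2.0, 3.0, 4.0, 5.0, 6.0])   # max over region is 5.0
broker.publish(1, [1.0, 2.0, 3.0, 12.0, 5.0, 6.0])  # max over region is 12.0
```

In this sketch the scientist never inspects the full data set: only the second published step satisfies the subscription, so only that step runs the action and produces a notification.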
Keywords :
Cray computers; data visualisation; database management systems; electronic messaging; middleware; HEC environment; HEC system; Jaguar Cray XK6 machine; Lonestar system; Oak Ridge National Laboratory; TACC; Texas Advanced Computing Center; application domain; data analysis; data event; data management; data pattern; data rate; data staging resource; data transformation; data value; data visualization; data-staging approach; in-situ analytics; in-transit analytics; large data volume; large scale scientific simulation; leadership-class high end computing; low-overhead associative messaging framework; messaging abstraction; publish/subscribe/notification-type messaging pattern; scalable associative messaging framework; scalable messaging system; scientific and engineering simulation; spatial data; staging-based online; temporal data; threshold value; associative messaging system; data staging; in-situ/in-transit analytics; publish/subscribe;
Conference_Title :
High Performance Computing (HiPC), 2012 19th International Conference on
Conference_Location :
Pune
Print_ISBN :
978-1-4673-2372-7
Electronic_ISBN :
978-1-4673-2370-3
DOI :
10.1109/HiPC.2012.6507512