Title :
Scalable and Fault-Tolerant Cloud Computations: Modelling and Implementation
Author :
Maria Spichkova;Ian E. Thomas;Heinz W. Schmidt;Iman I. Yusuf;Daniel W. Drumm;Steve Androulakis;George Opletal;Salvy P. Russo
Author_Institution :
RMIT Univ., Melbourne, VIC, Australia
Abstract :
This paper presents a formal model for science clouds, capable of predicting and controlling resources scalably, and its open source implementation, called Chiminey. The feasibility of Chiminey is shown using case studies on biophysics and structural chemistry computations. Big data is acquired from scientific instruments such as synchrotrons and atomic force microscopes. The model takes into account the architecture of the overall parallel and distributed system, including large-scale data sources; data sinks, for example petabyte research data stores; and cluster or cloud virtual resources and infrastructures, which users characterise upfront with simple parameters. Chiminey is developed to control large numbers of processes and to provide reliable computing and data management that researchers can use without having to learn extensive infrastructure concepts and technologies.
Keywords :
"Cloud computing","Fault tolerance","Fault tolerant systems","Computational modeling","IP networks","Connectors"
Conference_Title :
Parallel and Distributed Systems (ICPADS), 2015 IEEE 21st International Conference on
Electronic_ISBN :
1521-9097
DOI :
10.1109/ICPADS.2015.57