DocumentCode
1984397
Title
Scalable and Resilient Workflow Executions on Production Distributed Computing Infrastructures
Author
Balderrama, Javier Rojas ; Huu, Tram Truong ; Montagnat, Johan
Author_Institution
I3S Lab., Univ. of Nice-Sophia Antipolis, Nice, France
fYear
2012
fDate
25-29 June 2012
Firstpage
119
Lastpage
126
Abstract
Despite the growing interest in grid and cloud infrastructures among scientific communities and the availability of such facilities at large scale, achieving high performance in production environments remains challenging due to at least four factors: the low reliability of very large-scale distributed computing infrastructures, the performance overhead induced by shared facilities, the difficulty of obtaining a fair balance of all user jobs in such a heterogeneous environment, and the complexity of deploying large-scale distributed applications. Altogether, these difficulties make infrastructure exploitation complex and often limited to experts. This paper introduces a pragmatic solution to tackle these four issues, based on a service-oriented methodology, the reuse of existing middleware services, and the joint exploitation of local and distributed computing resources. Emphasis is placed on the ease of use of the integrated environment. Results on an actual neuroscience application show the impact of the environment setup in terms of reliability and performance. Recommendations and best practices are derived from this experiment.
Keywords
cloud computing; grid computing; middleware; natural sciences computing; service-oriented architecture; software performance evaluation; cloud infrastructures; grid infrastructures; middleware services; neuroscience application; performance overhead; pragmatic solution; production distributed computing infrastructures; resilient workflow executions; scalable workflow executions; scientific communities; service-oriented methodology; shared facilities; Diseases; Production; Reliability; Servers; Service oriented architecture; Distributed Computing Infrastructure; Grid Computing; Scientific Workflow; Service Oriented Architecture;
fLanguage
English
Publisher
IEEE
Conference_Titel
Parallel and Distributed Computing (ISPDC), 2012 11th International Symposium on
Conference_Location
Munich/Garching, Bavaria
Print_ISBN
978-1-4673-2599-8
Type
conf
DOI
10.1109/ISPDC.2012.24
Filename
6341502