DocumentCode
2310565
Title
Easy and instantaneous processing for data-intensive workflows
Author
Dun, Nan ; Taura, Kenjiro ; Yonezawa, Akinori
Author_Institution
Dept. of Comput. Sci., Univ. of Tokyo, Tokyo, Japan
fYear
2010
fDate
15-15 Nov. 2010
Firstpage
1
Lastpage
10
Abstract
This paper presents a lightweight and scalable framework that enables non-privileged users to effortlessly and instantaneously describe, deploy, and execute data-intensive workflows on arbitrary computing resources, including clusters, clouds, and supercomputers. The framework consists of three major components: the GXP parallel/distributed shell as the resource explorer and framework back-end, the GMount distributed file system as the underlying data sharing approach, and GXP Make as the workflow engine. With this framework, domain researchers can intuitively write workflow descriptions as GNU make rules and harness resources from different domains with low learning and setup cost. By investigating the execution of real-world scientific applications using this framework on multi-cluster and supercomputer platforms, we demonstrate that our processing framework has practically useful performance and is suitable for the common practice of data-intensive workflows in various distributed computing environments.
Keywords
cloud computing; data handling; parallel machines; peer-to-peer computing; resource allocation; GMount distributed file system; GXP distributed shell; GXP parallel shell; data sharing approach; data-intensive workflow engine; instantaneous processing; multicluster platform; resource explorer; supercomputer platform
fLanguage
English
Publisher
IEEE
Conference_Titel
Many-Task Computing on Grids and Supercomputers (MTAGS), 2010 IEEE Workshop on
Conference_Location
New Orleans, LA
Print_ISBN
978-1-4244-9704-1
Electronic_ISBN
978-1-4244-9705-8
Type
conf
DOI
10.1109/MTAGS.2010.5699428
Filename
5699428