DocumentCode :
3122301
Title :
X-CSR: Dataflow Optimization for Distributed XML Process Pipelines
Author :
Zinn, Daniel ; Bowers, Shawn ; McPhillips, Timothy ; Ludascher, Bertram
Author_Institution :
Dept. of Comput. Sci., UC Davis, Davis, CA
fYear :
2009
fDate :
March 29, 2009 - April 2, 2009
Firstpage :
577
Lastpage :
580
Abstract :
XML process networks are a simple yet powerful programming paradigm for loosely coupled, coarse-grained dataflow applications such as data-centric scientific workflows. We describe a framework called Delta-XML that is well suited for applications in which pipelines of data processors modify parts ("deltas") of XML data collections while keeping the overall collection structure intact. We show how to optimize the execution of Delta-XML process networks by minimizing the data shipping cost in distributed settings. This X-CSR optimization employs static type inference based on XML Schema to determine the XML stream fragments that are relevant to a processor, allowing irrelevant fragments to be bypassed ("shipped") to downstream pipeline steps. Finally, we present evaluation results for a real-world scientific workflow, which show the practical feasibility of X-CSR. A long version of this paper is available as.
Keywords :
XML; pipeline processing; Delta-XML; X-CSR; coarse-grained dataflow applications; data processors; data-centric scientific workflows; dataflow optimization; distributed XML process pipelines; Corporate acquisitions; Cost function; Data engineering; Design optimization; Distributed processing; Marine vehicles; Pipelines; Process design; Production; actors; data intensive; dataflow; pipeline; scientific workflow; shipping optimization; streaming
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Data Engineering, 2009. ICDE '09. IEEE 25th International Conference on
Conference_Location :
Shanghai
ISSN :
1084-4627
Print_ISBN :
978-1-4244-3422-0
Electronic_ISBN :
1084-4627
Type :
conf
DOI :
10.1109/ICDE.2009.72
Filename :
4812436