DocumentCode :
2093127
Title :
A Data Parallel Approach to XML Parsing and Query
Author :
You, Cheng-Han ; Wang, Sheng-De
Author_Institution :
Dept. of Electr. Eng., Nat. Taiwan Univ., Taipei, Taiwan
fYear :
2011
fDate :
2-4 Sept. 2011
Firstpage :
520
Lastpage :
527
Abstract :
Data-parallel XML parsing faces a crucial problem: how to partition XML documents. Existing approaches need a pre-parse step to determine the partitions. In this paper, we propose a direct parallel method that solves this problem without pre-parsing. The method starts parallel parsing directly by finding the "light tower", a particular character that is subject to some exceptions, called clues. We handle the exceptions by watching for the clues and reparsing a partition when the parsing stage requires it. We also propose a non-synchronized splitter approach to parallel XML querying with XPath expressions. In this approach, we split an XPath expression into pieces that are executed by threads, and we use a data structure called the ancestor table to let each thread handle its part of the XPath expression independently, without communication between threads. Our experiments show that our approach scales well from small files to very large files.
Keywords :
XML; data structures; document handling; grammars; parallel processing; XML document partitioning; XPath expression; ancestor table; data parallel approach; data structure; direct parallel method; exception handling; nonsynchronized splitter approach; parallel XML parsing; parallel XML querying; Data structures; Indexes; Instruction sets; Marine animals; Message systems; Poles and towers; XML; VTD-XML; XML parsing; XML querying; data parallel; multi-core;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
High Performance Computing and Communications (HPCC), 2011 IEEE 13th International Conference on
Conference_Location :
Banff, AB
Print_ISBN :
978-1-4577-1564-8
Electronic_ISBN :
978-0-7695-4538-7
Type :
conf
DOI :
10.1109/HPCC.2011.74
Filename :
6063034