DocumentCode
2962447
Title
Key Elements Tracing Method for Parallel XML Parsing in Multi-Core System
Author
Li, Xiaosong ; Wang, Hao ; Liu, Taoying ; Li, Wei
Author_Institution
Inst. of Comput. Technol., Chinese Acad. of Sci., Beijing, China
fYear
2009
fDate
8-11 Dec. 2009
Firstpage
439
Lastpage
444
Abstract
Although XML is used intensively in many applications, XML parsing is impractical in many fields because of its poor performance. Parallel XML parsing on multi-core systems is a promising remedy. Previous methods all adopt a data-parallel approach to XML parsing. Because XML is semi-structured, they are obliged to divide the data into well-formed XML chunks and then parse these chunks in parallel. This division process is called preparsing. Because preparsing is serial, it becomes the bottleneck of parallel XML parsing. A related work, Simultaneous Finite Transducer (SFTXP), parallelized the preparsing stage. It maintains multiple preparsing results for each equal-sized chunk by enumerating all possible parsing states. Although the set of parsing states for each XML document is finite, the overhead of SFTXP is tremendous, including the CPU time to generate the multiple results and the memory to store them. In this work, we address parallel XML parsing with the Key Element Parse Tracing (KEPT) method, which parallelizes both preparsing and parsing at the element level. It recasts preparsing as a key element extraction process and schedules the processing of key elements within the KEPT framework, so that the parsing process is parallelized as a whole. To demonstrate its effectiveness, we implement it on libxml2 and obtain good scalability on both an 8-core Linux machine and an 8-core 24 SMT Sun machine running Solaris.
Keywords
XML; program compilers; 8-core 24 SMT Sun machine; 8-core Linux machine; SFTXP; key elements parse tracing method; libxml2; multi-core system; parallel XML parsing; Computers; Distributed computing; Linux; Parallel processing; Scalability; Skeleton; Surface-mount technology; Throughput; Transducers; XML; XML parsing; key element tracing; multi-core; parallel;
fLanguage
English
Publisher
IEEE
Conference_Titel
Parallel and Distributed Computing, Applications and Technologies, 2009 International Conference on
Conference_Location
Higashi Hiroshima
Print_ISBN
978-0-7695-3914-0
Type
conf
DOI
10.1109/PDCAT.2009.64
Filename
5372764