DocumentCode
3421658
Title
Evaluating the role of context in syntax directed compression of XML documents
Author
Hariharan, S. ; Shankar, Priti
Author_Institution
Dept. of Comput. Sci. & Autom., Indian Inst. of Sci., Bangalore
fYear
2006
fDate
28-30 March 2006
Lastpage
453
Abstract
Summary form only given. This paper proposes a new technique for tracking context to be used in a statistical code compression scheme for XML documents. Based on recursive finite state machines, the technique employs an arithmetic coding scheme. The tradeoffs between space and compression ratio are studied by observing the effects of either using or ignoring root-to-leaf contexts for textual content in the associated tree structures. The scheme is syntax-aware, and the compressor and decompressor can be generated automatically from the document type definition (DTD) without interactive input from the user. A comparison of the path-sensitive and path-agnostic schemes for storing context for PCDATA was performed. Experimental results show that path-sensitive schemes are less effective in the fixed memory model.
Keywords
XML; arithmetic codes; computational linguistics; data compression; finite state machines; statistical analysis; tree codes; XML documents; arithmetic coding scheme; document type definition; path sensitive schemes; recursive finite state machines; statistical code compression scheme; syntax directed compression; tree structures; Arithmetic; Automata; Automation; Computer science; Data compression; Decoding; Mirrors; Size measurement; Tree data structures; XML;
fLanguage
English
Publisher
ieee
Conference_Titel
Data Compression Conference, 2006. DCC 2006. Proceedings
Conference_Location
Snowbird, UT
ISSN
1068-0314
Print_ISBN
0-7695-2545-8
Type
conf
DOI
10.1109/DCC.2006.34
Filename
1607296