• DocumentCode
    3421658
  • Title

    Evaluating the role of context in syntax directed compression of XML documents

  • Author

    Hariharan, S. ; Shankar, Priti

  • Author_Institution
    Dept. of Comput. Sci. & Autom., Indian Inst. of Sci., Bangalore
  • fYear
    2006
  • fDate
    28-30 March 2006
  • Lastpage
    453
  • Abstract
    Summary form only given. This paper proposes a new technique for tracking context to be used in a statistical code compression scheme for XML documents. Based on recursive finite state machines, the technique employs an arithmetic coding scheme. The tradeoff between space and compression ratio is studied by observing the effects of either using or ignoring root-to-leaf contexts for textual content in the associated tree structures. The scheme is syntax-aware, and the compressor and decompressor can be generated automatically from the document type definition (DTD) without interactive input from the user. A comparison of the path-sensitive and path-agnostic schemes for storing context for PCDATA was performed. Experimental results show that path-sensitive schemes are less effective in the fixed memory model.
  • Keywords
    XML; arithmetic codes; computational linguistics; data compression; finite state machines; statistical analysis; tree codes; XML documents; arithmetic coding scheme; document type definition; path sensitive schemes; recursive finite state machines; statistical code compression scheme; syntax directed compression; tree structures; Arithmetic; Automata; Automation; Computer science; Decoding; Mirrors; Size measurement; Tree data structures
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Data Compression Conference, 2006. DCC 2006. Proceedings
  • Conference_Location
    Snowbird, UT
  • ISSN
    1068-0314
  • Print_ISBN
    0-7695-2545-8
  • Type

    conf

  • DOI
    10.1109/DCC.2006.34
  • Filename
    1607296