• DocumentCode
    2744149
  • Title

    CSV compaction to improve data-processing performance for large XML documents

  • Author

    Yoshida, Shigeru ; Yahagi, Hironori ; Odagiri, Junichi

  • Author_Institution
    Peripheral Syst. Labs., Fujitsu Labs. Ltd., Atsugi, Japan
  • fYear
    2004
  • fDate
    23-25 March 2004
  • Firstpage
    574
  • Abstract
    XML (Extensible Markup Language) is a globally standardized, flexible format for expressing electronic data, and its use is expected to spread further across varied applications. To improve data-processing performance for XML documents in record form, such as bills and customer lists (which tend to be large in business use), a reversible compaction method that reduces the number of apparent elements has been newly developed. The compaction is executed using XSLT, a format-conversion facility that is a basic function of XML environments such as Apache XML software and MSXML in Microsoft Internet Explorer 6.0. To make the conversion easy to perform, the user simply draws up an XML document, the "compaction specification", which enumerates all elements in a record and specifies the elements to be compacted. The improvement in data-processing performance was measured using test documents, and the improvement in main-memory requirements and parsing time for a DOM parser, a standardized XML API, is also presented.
  • Keywords
    XML; data compression; document handling; CSV compaction; XML documents; XSLT; data-processing performance; electronic data expression; extensible markup language; format conversion; parsing time; reversible compaction method; test documents; Application software; Benchmark testing; Compaction; Flexible electronics; Internet; Java; Laboratories; Size measurement; Software measurement
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Data Compression Conference, 2004. Proceedings. DCC 2004
  • ISSN
    1068-0314
  • Print_ISBN
    0-7695-2082-0
  • Type

    conf

  • DOI
    10.1109/DCC.2004.1281550
  • Filename
    1281550
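
The abstract describes a reversible compaction that reduces the number of apparent elements in record-form XML. The following is a minimal sketch of that general idea only, not the authors' method: the paper performs the conversion with XSLT driven by a "compaction specification" document, whereas this sketch uses Python's standard `xml.etree.ElementTree` and `csv` modules. The function names `compact` and `expand`, the `fields` list standing in for the compaction specification, and the choice to store the packed values as the record's text are all assumptions made for illustration.

```python
import csv
import io
import xml.etree.ElementTree as ET


def compact(root, record_tag, fields):
    """Pack the listed child elements of each record into one CSV text node.

    `fields` is a hypothetical stand-in for the paper's compaction
    specification: it enumerates the elements of a record, in order.
    """
    out = ET.Element(root.tag)
    for rec in root.findall(record_tag):
        buf = io.StringIO()
        # csv.writer quotes values that contain commas or quotes,
        # which is what keeps the packing reversible.
        csv.writer(buf, lineterminator="").writerow(
            [rec.findtext(name, default="") for name in fields]
        )
        ET.SubElement(out, record_tag).text = buf.getvalue()
    return out


def expand(root, record_tag, fields):
    """Rebuild one child element per field from each record's CSV text."""
    out = ET.Element(root.tag)
    for rec in root.findall(record_tag):
        values = next(csv.reader([rec.text or ""]), [])
        new_rec = ET.SubElement(out, record_tag)
        for name, value in zip(fields, values):
            ET.SubElement(new_rec, name).text = value
    return out
```

Under these assumptions, a record with N fields shrinks from N + 1 elements to a single element, so a DOM parser has fewer nodes to build; the sketch ignores attributes, nested elements, and mixed content, which a real compaction specification would need to address.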