• DocumentCode
    1871953
  • Title
    Speculative p-DFAs for parallel XML parsing
  • Author
    Zhang, Ying ; Pan, Yinfei ; Chiu, Kenneth
  • Author_Institution
    Dept. of Comput. Sci., State Univ. of New York - Binghamton, Binghamton, NY, USA
  • fYear
    2009
  • fDate
    16-19 Dec. 2009
  • Firstpage
    388
  • Lastpage
    397
  • Abstract
    XML has seen wide acceptance in a number of application domains and has contributed to the success of wide-scale grid and scientific computing environments. Performance, however, is still an issue and limits adoption in some situations where XML might otherwise provide significant interoperability, flexibility, and extensibility. As CPUs increasingly have multiple cores, parallel XML parsing can help address this concern. This paper explores the use of speculation to improve the performance of parallel XML parsing. Building on previous work, we use an initial preparsing stage to build a sketch of the document, which we call the skeleton. The skeleton contains enough information that the full parse can then proceed in parallel using unmodified libxml2. The preparsing itself is parallelized using product machines, which we call p-DFAs. During execution, unlikely possibilities are discarded in favor of more likely ones; statistics are gathered to decide which possibilities are unlikely. The results show good performance and scalability on both a 30-CPU Sun E6500 machine running Solaris and a Linux machine with two Intel Xeon L5320 CPUs, for a total of 8 physical cores.
  • Keywords
    XML; document handling; open systems; parallel processing; statistical analysis; Intel Xeon L5320 CPU; Linux machine; Solaris; Sun E6500 machine; parallel XML parsing; preparsing stage; product machines; scientific computing environment; skeleton; speculative p-DFA; statistics; unmodified libxml2; wide-scale grid environment; Application software; Computer science; Linux; Multicore processing; Scalability; Scientific computing; Skeleton; Statistics; Sun; XML
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    High Performance Computing (HiPC), 2009 International Conference on
  • Conference_Location
    Kochi
  • Print_ISBN
    978-1-4244-4922-4
  • Electronic_ISBN
    978-1-4244-4921-7
  • Type
    conf
  • DOI
    10.1109/HIPC.2009.5433187
  • Filename
    5433187