• DocumentCode
    2860280
  • Title

    An XML Data Placement Strategy for Distributed XML Storage and Parallel Query

  • Author

    Zhang, Jing ; Lang, Bo ; Duan, Yawei

  • Author_Institution
    State Key Lab. of Software Dev. Environ., Beihang Univ., Beijing, China
  • fYear
    2011
  • fDate
    20-22 Oct. 2011
  • Firstpage
    433
  • Lastpage
    439
  • Abstract
    As large numbers of XML documents are generated across application domains, efficient XML management has become an important problem. Distributed XML storage and parallel query based on MapReduce can be an effective solution. Because the XML data placement strategy is a key factor in parallel system performance, this paper presents the Query Workload Estimation based XML Placement strategy (QWEXP) for efficient distributed XML storage and parallel query. To balance query workload, QWEXP partitions XML according to a query workload estimate computed from the XML structure alone, without knowledge of user queries, since in common application scenarios user queries are not known in advance. The partitioned XML segments are sized around an XML storage unit W0 to support the scalability of the parallel XML database. Finally, the segments are distributed evenly across the processing nodes to ensure workload balance during parallel query execution. Experimental results show that QWEXP greatly improves the speedup and scaleup properties of the parallel XML system.
  • Keywords
    XML; data structures; document handling; parallel processing; query processing; storage management; MapReduce; QWEXP; XML data placement strategy; XML document management; XML segment; XML storage unit; XML structure; distributed XML storage; parallel XML database; parallel query execution; parallel system performance; query workload estimation based XML placement strategy; Benchmark testing; Distributed databases; Estimation; Partitioning algorithms; Scalability; XML data placement; parallel XML query
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Parallel and Distributed Computing, Applications and Technologies (PDCAT), 2011 12th International Conference on
  • Conference_Location
    Gwangju
  • Print_ISBN
    978-1-4577-1807-6
  • Type

    conf

  • DOI
    10.1109/PDCAT.2011.19
  • Filename
    6118543