• DocumentCode
    3520245
  • Title
    Join Optimization in the MapReduce Environment for Column-wise Data Store
  • Author
    Zhou, Minqi; Zhang, Rong; Zeng, Dadan; Qian, Weining; Zhou, Aoying
  • Author_Institution
    Software Eng. Inst., East China Normal Univ., Shanghai, China
  • fYear
    2010
  • fDate
    1-3 Nov. 2010
  • Firstpage
    97
  • Lastpage
    104
  • Abstract
    Chain join processing, which combines records from two or more tables sequentially, has been well studied in centralized databases. However, it has seldom been discussed in the cloud computing era and remains an urgent problem to solve, especially where structured (or relational) data are stored in a column-wise (attribute-wise) fashion in distributed file systems (e.g., Google File System) over hundreds or even thousands of commodity PCs. In this paper, we propose a novel method for chain join processing, which is one of the common primitives in the cloud era for the analysis of column-wise stored data. By effectively selecting the records (tuples) needed for the chain join based on the information exploited within the bipartite join graph, the communication cost of record transmission can be reduced dramatically. A bushy tree structure is deployed to regulate the chain join sequence, which further reduces the number of intermediate results generated and transmitted and exploits higher parallelism, resulting in more efficient join processing. Our extensive performance study confirms the effectiveness and efficiency of our methods.
  • Keywords
    cloud computing; distributed databases; graph theory; tree data structures; Google file system; MapReduce environment; bipartite join graph; bushy tree structure; centralized database; chain join processing; chain join sequence; column-wise stored data analysis; communication cost; distributed file system; record transmission
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Semantics Knowledge and Grid (SKG), 2010 Sixth International Conference on
  • Conference_Location
    Beijing
  • Print_ISBN
    978-1-4244-8125-5
  • Electronic_ISBN
    978-0-7695-4189-1
  • Type
    conf
  • DOI
    10.1109/SKG.2010.18
  • Filename
    5663487