Title :
Distribution of relations in parallel database based on PC clusters
Author_Institution :
Sch. of Comput. Sci. & Technol., Heilongjiang Univ., Harbin, China
Abstract :
In a parallel database system, optimizing the distribution of relations can greatly improve the processing efficiency of multi-join queries. Data communication is expensive in parallel systems based on PC clusters. This paper proposes a relation distribution algorithm that selects an appropriate data placement strategy for each relation, including the choice of distribution attributes and nodes. The algorithm makes full use of the intra-operator parallelism, independent inter-operator parallelism, and pipelined parallelism of a PC cluster system while reducing the additional communication cost of data redistribution. Experimental results indicate that the algorithm performs well and improves the execution efficiency of parallel multi-join queries.
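The abstract does not give the algorithm itself, so the following is only a minimal sketch of the general idea behind choosing distribution attributes: if two relations are hash-partitioned on their join attributes across the same nodes, that join can run locally without redistribution. The frequency heuristic and the names `choose_distribution_attributes`, `hash_partition`, and `needs_redistribution` are assumptions made for illustration, not the paper's method.

```python
from collections import Counter, defaultdict


def choose_distribution_attributes(joins):
    """Pick, for each relation, the attribute it is joined on most often.

    `joins` is a list of (rel_a, attr_a, rel_b, attr_b) equi-join predicates.
    This frequency heuristic is an illustrative assumption only.
    """
    counts = defaultdict(Counter)
    for rel_a, attr_a, rel_b, attr_b in joins:
        counts[rel_a][attr_a] += 1
        counts[rel_b][attr_b] += 1
    # most_common(1) yields the most frequent attribute for the relation.
    return {rel: c.most_common(1)[0][0] for rel, c in counts.items()}


def hash_partition(rows, attr, num_nodes):
    """Assign each row (a dict) to a node by hashing its distribution attribute."""
    nodes = [[] for _ in range(num_nodes)]
    for row in rows:
        nodes[hash(row[attr]) % num_nodes].append(row)
    return nodes


def needs_redistribution(join, placement):
    """A join is local only if both sides are partitioned on their join attribute.

    Assumes every relation is spread over the same set of nodes with the
    same hash function.
    """
    rel_a, attr_a, rel_b, attr_b = join
    return placement[rel_a] != attr_a or placement[rel_b] != attr_b


if __name__ == "__main__":
    joins = [("R", "a", "S", "a"), ("S", "b", "T", "b"), ("S", "a", "U", "a")]
    placement = choose_distribution_attributes(joins)
    for join in joins:
        print(join, "redistribute" if needs_redistribution(join, placement) else "local")
```

In this sketch, relation `S` is partitioned on `a` because it joins on `a` twice, so only the join on `S.b = T.b` would require moving data between nodes. A real placement strategy would also weigh relation sizes, node selection, and pipelining, which the sketch ignores.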
Keywords :
parallel processing; query processing; relational databases; PC clusters; data placement strategy; independent inter-operator parallelism; intra-operator parallelism; multijoin query processing; parallel database; pipelined parallelism; relation distribution; Bandwidth; Clustering algorithms; Computer science; Cost function; Data communication; Multidimensional systems; Pipelines; Probability; data redistribution; multi-join query
Conference_Title :
Advanced Computer Control (ICACC), 2010 2nd International Conference on
Conference_Location :
Shenyang
Print_ISBN :
978-1-4244-5845-5
DOI :
10.1109/ICACC.2010.5486843