DocumentCode
1959673
Title
Handling data skew in parallel hash join computation using two-phase scheduling
Author
Zhou, Xiaofang ; Orlowska, Maria E.
Author_Institution
Div. of Inf. Technol., CSIRO, Canberra, ACT, Australia
Volume
2
fYear
1995
fDate
19-21 Apr 1995
Firstpage
527
Abstract
A large number of parallel join algorithms have been proposed to maintain load balance in the presence of data skew. However, one important type of data skew, join product skew (JPS), has been little studied. In this paper, a dynamic parallel join algorithm, which employs a two-phase scheduling procedure, is designed to handle the JPS problem. Two sets of scheduling heuristics are studied against various parameters. It is shown that many of the existing algorithms can be regarded as special cases of our algorithm, whose cost depends on the nature of the data skew. While it can cope with JPS, which other algorithms cannot handle, it can be as efficient as most existing algorithms when JPS does not exist.
Keywords
parallel algorithms; processor scheduling; query processing; relational databases; resource allocation; data skew; dynamic parallel join algorithm; load-balancing; parallel hash join computation; two-phase scheduling; two-phase scheduling procedure; Algorithm design and analysis; Computer science; Concurrent computing; Dynamic scheduling; Government; Information technology; Parallel architectures; Processor scheduling; Relational databases; Scheduling algorithm;
fLanguage
English
Publisher
IEEE
Conference_Titel
Algorithms and Architectures for Parallel Processing, 1995. ICAPP 95. IEEE First ICA³PP., IEEE First International Conference on
Conference_Location
Brisbane, Qld.
Print_ISBN
0-7803-2018-2
Type
conf
DOI
10.1109/ICAPP.1995.472237
Filename
472237