DocumentCode
2626755
Title
Parallel implementations of exclusion joins
Author
Shum, Chung-Dak
Author_Institution
Dept. of Comput. Sci., Hong Kong Univ. of Sci. & Technol., Kowloon, Hong Kong
fYear
1993
fDate
1-4 Dec 1993
Firstpage
742
Lastpage
747
Abstract
This paper examines the parallel processing of exclusion joins in a shared-nothing multiprocessor environment. First, a parallel hash-based exclusion join algorithm is presented. Unlike in the equijoin case, this algorithm does not work correctly when the join attributes contain nulls. One solution is to restrict the hash-on attributes to non-nullable fields. However, this can lead to the well-known data skew problem. If the number of tuples containing null values in their join attributes is small, an alternative is to replicate those tuples to all processors. Otherwise, we can consider a range partitioning algorithm in which those tuples are sent to only a small subset of the processors. The hash-based algorithm usually outperforms the range partitioning algorithm, except when the number of tuples containing null values in their join attributes is large or when the data is highly skewed.
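The replication idea described in the abstract can be illustrated with a small single-process sketch. It is not the paper's algorithm: it assumes SQL `NOT IN`-style semantics, in which a null join attribute may equal any value, and it simulates the processors with plain Python lists. All names (`may_match`, `serial_exclusion_join`, `parallel_exclusion_join`) are hypothetical and chosen only for this illustration.

```python
from typing import Optional

# A row is a tuple of join attributes; None stands for SQL NULL (assumption).
Row = tuple[Optional[int], ...]


def has_null(row: Row) -> bool:
    return any(v is None for v in row)


def may_match(r: Row, s: Row) -> bool:
    # Assumed NOT IN-style rule: r and s "may match" unless some attribute
    # pair is non-null on both sides and definitely different.
    return all(a is None or b is None or a == b for a, b in zip(r, s))


def serial_exclusion_join(R: list[Row], S: list[Row]) -> list[Row]:
    # Keep the rows of R that cannot match any row of S.
    return [r for r in R if not any(may_match(r, s) for s in S)]


def parallel_exclusion_join(R: list[Row], S: list[Row], n: int) -> list[Row]:
    # Null-free rows are hash-partitioned on all join attributes.
    local_r: list[list[Row]] = [[] for _ in range(n)]
    local_s: list[list[Row]] = [[] for _ in range(n)]
    for r in R:
        if not has_null(r):
            local_r[hash(r) % n].append(r)
    for s in S:
        if not has_null(s):
            local_s[hash(s) % n].append(s)

    # Rows containing nulls are replicated to every simulated processor.
    replicated_r = [r for r in R if has_null(r)]
    replicated_s = [s for s in S if has_null(s)]

    result: list[Row] = []
    for p in range(n):
        # A null-free R row can only match an equal S row (same partition)
        # or a null-containing S row (replicated everywhere).
        result.extend(
            serial_exclusion_join(local_r[p], local_s[p] + replicated_s)
        )

    # A replicated R row qualifies only if it survives on every processor.
    for r in replicated_r:
        if all(
            not any(may_match(r, s) for s in local_s[p] + replicated_s)
            for p in range(n)
        ):
            result.append(r)
    return result


if __name__ == "__main__":
    R = [(1, 2), (3, 4), (None, 5), (6, None), (7, 8)]
    S = [(1, 2), (3, None), (9, 5)]
    print(serial_exclusion_join(R, S))
    print(parallel_exclusion_join(R, S, 3))
```

In this sketch every replicated row is checked on all `n` partitions, which hints at why replication is attractive only when few tuples contain nulls: the extra work grows with both the number of such tuples and the number of processors.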
Keywords
database theory; distributed databases; parallel algorithms; query processing; data skew problem; exclusion join; hash-based algorithm; hash-on attributes; parallel hash-based exclusion join algorithm; parallel processing; range partitioning algorithm; shared-nothing multiprocessor environment; Computer science; Database machines; Joining IEEE; Parallel processing; Partitioning algorithms
fLanguage
English
Publisher
IEEE
Conference_Titel
Parallel and Distributed Processing, 1993. Proceedings of the Fifth IEEE Symposium on
Conference_Location
Dallas, TX
Print_ISBN
0-8186-4222-X
Type
conf
DOI
10.1109/SPDP.1993.395458
Filename
395458