• DocumentCode
    3268051
  • Title

    Improving hash join performance through prefetching

  • Author

    Chen, Shimin ; Ailamaki, Anastassia ; Gibbons, Phillip B. ; Mowry, Todd C.

  • Author_Institution
    Carnegie Mellon Univ., Pittsburgh, PA, USA
  • fYear
    2004
  • fDate
    30 March-2 April 2004
  • Firstpage
    116
  • Lastpage
    127
  • Abstract
    Hash join algorithms suffer from extensive CPU cache stalls. We show that the standard hash join algorithm for disk-oriented databases (i.e., GRACE) spends over 73% of its user time stalled on CPU cache misses, and explore the use of prefetching to improve its cache performance. Applying prefetching to hash joins is complicated by the data dependencies, multiple code paths, and inherent randomness of hashing. We present two techniques, group prefetching and software-pipelined prefetching, that overcome these complications. These schemes achieve 2.0-2.9X speedups for the join phase and 1.4-2.6X speedups for the partition phase over GRACE and simple prefetching approaches. Compared with previous cache-aware approaches (i.e., cache partitioning), the schemes are at least 50% faster on large relations and do not require exclusive use of the CPU cache to be effective.
  • Keywords
    cache storage; database management systems; performance evaluation; CPU cache miss; CPU cache stalls; cache performance; data dependency; disk-oriented database; group prefetching; hash join algorithm; inherent randomness; join phase; multiple code path; partition phase; prefetching; software-pipelined prefetching; Costs; Database systems; Delay; Electric breakdown; Partitioning algorithms; Prefetching; Probes;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Data Engineering, 2004. Proceedings. 20th International Conference on
  • ISSN
    1063-6382
  • Print_ISBN
    0-7695-2065-0
  • Type

    conf

  • DOI
    10.1109/ICDE.2004.1319989
  • Filename
    1319989