• DocumentCode
    2506735
  • Title
    Efficient creation of statistics over query expressions
  • Author
    Bruno, Nicolas; Chaudhuri, Surajit
  • Author_Institution
    Columbia Univ., USA
  • fYear
    2003
  • fDate
    5-8 March 2003
  • Firstpage
    201
  • Lastpage
    212
  • Abstract
    Query optimizers use base-table statistics to derive statistics on the subplans that are enumerated during optimization. In practice, traditional optimizers rely on a number of simplifying assumptions, which can compromise the accuracy of cardinality estimates. To address this limitation, we previously introduced SITs, which are statistics built over query expressions, and explained how a traditional optimizer can use SITs judiciously to sidestep the problem of inaccurate estimates. A significant challenge left unaddressed was how to build SITs efficiently in a database system. We present a family of techniques to create SITs. These techniques differ in the trade-off they offer between accuracy and efficiency of creation. We also present techniques to efficiently create multiple SITs by taking advantage of the commonalities among their generating query expressions.
  • Keywords
    query processing; relational databases; statistical analysis; base-table statistics; cardinality estimates; database system; query expressions; query optimizers; Cost function; Data engineering; Database systems; Histograms; Performance evaluation; Query processing; Relational databases; Sampling methods; Statistical distributions; Statistics
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Data Engineering, 2003. Proceedings. 19th International Conference on
  • Print_ISBN
    0-7803-7665-X
  • Type
    conf
  • DOI
    10.1109/ICDE.2003.1260793
  • Filename
    1260793