• DocumentCode
    3077471
  • Title
    Materialized view size estimation using sampling
  • Author
    Bhan, Madhu ; Kumar, T.V.S. ; Rajanikanth, K.
  • Author_Institution
    M.S. Ramaiah Inst. of Technol., Bangalore, India
  • fYear
    2013
  • fDate
    26-28 Dec. 2013
  • Firstpage
    1
  • Lastpage
    5
  • Abstract
    Online Analytical Processing (OLAP) aims to extract useful information quickly from the large amounts of data residing in a data warehouse. To reduce the execution time of aggregate queries in a data warehousing environment, frequently used aggregates are often pre-computed and materialized as summary views so that future queries can use them directly. Materializing these summary views can minimize query response time, but because the number of views grows exponentially with the number of dimensions, not all views can be materialized. Under a storage constraint, we must therefore select an optimal set of views to materialize. Making that selection requires the number of rows (size) of each view, and counting the actual rows in each view takes considerable time that a data warehouse administrator may not be able to afford. We explore the use of sampling to estimate the size of views. In this paper we propose a hybrid estimator that takes into account the degree of skew in the data and combines the Jackknife estimator with the Schlosser estimator to estimate view size more accurately. The proposed hybrid estimator has been used to estimate view sizes in the TPC-H benchmark, and the results show better estimates than those of the individual estimators.
  • Keywords
    data mining; data warehouses; sampling methods; Jacknife estimator; OLAP systems; Schlosser estimator; data warehouse; materialized view size estimation; online analytical processing; Benchmark testing; Data warehouses; Databases; Estimation; Lattices; Sociology; Statistics; Data warehouse; Materialized views; Online Analytical Processing; Sampling; Size estimation;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computational Intelligence and Computing Research (ICCIC), 2013 IEEE International Conference on
  • Conference_Location
    Enathi
  • Print_ISBN
    978-1-4799-1594-1
  • Type
    conf
  • DOI
    10.1109/ICCIC.2013.6724143
  • Filename
    6724143
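
The abstract describes a hybrid estimator that picks between a Jackknife and a Schlosser estimate according to data skew, but this record does not give the formulas or the selection rule. The following is a minimal sketch under stated assumptions, not the authors' implementation: it uses the commonly cited unsmoothed first-order jackknife and Schlosser distinct-value formulas, and the skew measure (squared coefficient of variation of the sample counts), the `skew_threshold` parameter, and all function names are hypothetical choices made for illustration. Here `sample` stands for the grouping-key values of sampled rows, and `population_size` for the row count of the base table.

```python
from collections import Counter


def frequency_profile(sample):
    """Return (d, f): d = distinct values in the sample,
    f[i] = number of values that occur exactly i times."""
    counts = Counter(sample)
    return len(counts), Counter(counts.values())


def jackknife_estimate(sample, population_size):
    """Unsmoothed first-order jackknife: D = d / (1 - (1 - q) * f1 / n)."""
    n = len(sample)
    q = n / population_size  # sampling fraction
    d, f = frequency_profile(sample)
    return d / (1.0 - (1.0 - q) * f[1] / n)


def schlosser_estimate(sample, population_size):
    """Schlosser: D = d + f1 * sum((1-q)^i f_i) / sum(i q (1-q)^(i-1) f_i)."""
    n = len(sample)
    q = n / population_size
    d, f = frequency_profile(sample)
    if f[1] == 0:
        # No singletons in the sample: the correction term vanishes.
        return float(d)
    num = sum((1.0 - q) ** i * fi for i, fi in f.items())
    den = sum(i * q * (1.0 - q) ** (i - 1) * fi for i, fi in f.items())
    return d + f[1] * num / den


def hybrid_estimate(sample, population_size, skew_threshold=1.0):
    """Assumed rule: low skew -> jackknife, high skew -> Schlosser.
    Both the skew measure and the threshold are illustrative assumptions."""
    counts = list(Counter(sample).values())
    mean = sum(counts) / len(counts)
    var = sum((c - mean) ** 2 for c in counts) / len(counts)
    cv_squared = var / mean ** 2
    if cv_squared < skew_threshold:
        return jackknife_estimate(sample, population_size)
    return schlosser_estimate(sample, population_size)
```

In a view-selection setting, the estimated number of distinct grouping-key combinations would serve as the estimated row count of the corresponding summary view; how the sample is drawn and how the threshold is tuned are left open here.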