Title :
Dynamic Materialization of Query Views for Data Warehouse Workloads
Author :
Phan, Thomas ; Li, Wen-Syan
Author_Institution :
Yahoo!, Inc., Sunnyvale, CA
Abstract :
A materialized view, or Materialized Query Table (MQT), is an auxiliary table with precomputed data that can be used to significantly improve the performance of a database query. Previous research efforts have focused on finding the best candidate MQT set, with a common static heuristic being to greedily pre-materialize the MQTs prior to executing the workload. While this approach is sound when the size of the MQT set on disk is small, it will not be able to pre-materialize all MQTs and indexes when faced with real-world disk limits and view maintenance costs, and thus a static heuristic will fail to exploit the potentially large benefits of those MQTs not selected for materialization. In this paper we present an automated, dynamic MQT management scheme that materializes views and creates indexes in an on-demand fashion as a workload executes and manages them with an LRU cache. In order to maximize the benefit of executing queries with MQTs, the scheme makes an adaptive tradeoff between the MQT materializations, the base table accesses, and the benefit of MQT hits in the cache. To find the workload permutation that produces the overall highest net benefit, we use a genetic algorithm to search the N! solution space, and to avoid materializing seldom-used MQTs, we prune the set of MQT candidates. We ran our dynamic management on a TPC-H workload and found that our scheme produces higher benefit across a variety of scenarios; we demonstrate over 60% improvement in MQT benefit under harsh conditions when the size of available MQTs with indexes is larger than the cache.
Keywords :
cache storage; data warehouses; database indexing; genetic algorithms; query processing; LRU cache; MQT auxiliary table; automated dynamic MQT management scheme; data warehouse workloads; database indexes; database query; dynamic query view materialization; genetic algorithm; materialized query table; Cost function; Data warehouses; Database systems; Dynamic scheduling; Genetic algorithms; Job shop scheduling; Query processing; Radio access networks;
Conference_Titel :
Data Engineering, 2008. ICDE 2008. IEEE 24th International Conference on
Conference_Location :
Cancun
Print_ISBN :
978-1-4244-1836-7
Electronic_ISBN :
978-1-4244-1837-4
DOI :
10.1109/ICDE.2008.4497452