DocumentCode
1877233
Title
SCRAP: A Statistical Approach for Creating a Database Query Workload Based on Performance Bottlenecks
Author
Skarie, James ; Debnath, Biplob K. ; Lilja, David J. ; Mokbel, Mohamed F.
Author_Institution
Electrical and Computer Engineering Department, University of Minnesota, USA. skar0059@umn.edu
fYear
2007
fDate
27-29 Sept. 2007
Firstpage
183
Lastpage
192
Abstract
With the tremendous growth in stored data, the role of database systems has become more significant than ever before. Standard query workloads, such as the TPC-C and TPC-H benchmark suites, are used to evaluate and tune the functionality and performance of database systems. Running and configuring benchmarks is a time-consuming task. It requires substantial statistical expertise due to the enormous data size and large number of queries in the workload. Subsetting can be used to reduce the number of queries in a workload. An existing workload subsetting technique selected queries based on similarities of the ranks of the queries for low-level characteristics, such as cache miss rates, or based on the execution time required in different computer systems. However, many low-level characteristics are correlated and produce similar behaviors. Also, raw execution time as a metric is too diffuse to capture important performance bottlenecks. Our goal is to select a subset of queries that can reproduce the same bottlenecks in the system as the original workload. In this paper, we propose a statistical approach for creating a database query workload based on performance bottlenecks (SCRAP). Our methodology takes a query workload and a set of system configuration parameters as inputs, and selects a subset of the queries from the workload based on the similarity of performance bottlenecks. Experimental results using the TPC-H benchmark and the PostgreSQL database system show that the reduced workload and the original workload produce similar performance bottlenecks, and the subset accurately estimates the total execution time.
Keywords
Buildings; Computer science; Costs; Data engineering; Database systems; Indexes; Internet; Runtime; System performance
fLanguage
English
Publisher
IEEE
Conference_Titel
Workload Characterization, 2007. IISWC 2007. IEEE 10th International Symposium on
Conference_Location
Boston, MA, USA
Print_ISBN
978-1-4244-1561-8
Electronic_ISBN
978-1-4244-1562-5
Type
conf
DOI
10.1109/IISWC.2007.4362194
Filename
4362194