Title :
Exact and approximate algorithms for the index selection problem in physical database design
Author :
Caprara, Alberto ; Fischetti, Matteo ; Maio, Dario
Author_Institution :
Dipartimento di Elettronica, Inf. e Sistemistica, Bologna Univ., Italy
fDate :
12/1/1995
Abstract :
The index selection problem (ISP) is an important optimization problem in the physical design of databases. The aim of this paper is to show that ISP, although NP-hard, can in practice be solved effectively through well-designed algorithms. We formulate ISP as a 0-1 integer linear program and describe an exact branch-and-bound algorithm based on the linear programming relaxation of the model. The performance of the algorithm is enhanced by means of procedures to reduce the size of the candidate index set. We also describe heuristic algorithms based on the solution of a suitably defined knapsack subproblem and on Lagrangian decomposition. Finally, computational results on several classes of test problems are given. We report the exact solution of large-scale ISP instances involving several hundred indexes and queries. We also evaluate one of the heuristic algorithms we propose on very large-scale instances involving several thousand indexes and queries and show that it consistently produces very tight approximate (and sometimes provably optimal) solutions. Finally, we discuss possible extensions and future directions of research.
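To make the optimization problem in the abstract concrete, the following is a minimal Python sketch of a toy ISP instance solved by exhaustive enumeration. The abstract does not state the model, so everything here is an assumption made for illustration: each index j has a fixed cost for being built, each query i obtains a gain if it is served by index j, every query uses at most one built index, and the objective is total gain minus total fixed cost. The function name `solve_isp_brute_force` and the data are hypothetical. Enumeration is not the branch-and-bound or the heuristics described in the paper; it only shows what an optimal selection means on a tiny instance.

```python
from itertools import combinations


def solve_isp_brute_force(fixed_cost, gain):
    """Enumerate every subset of indexes and return (best_value, best_subset).

    fixed_cost[j] -- assumed cost of building and maintaining index j
    gain[i][j]    -- assumed gain when query i is answered using index j
    Each query uses at most one built index (or none, contributing 0).
    """
    num_indexes = len(fixed_cost)
    # Building no index at all is always feasible and has net value 0.
    best_value, best_subset = 0, frozenset()
    for size in range(1, num_indexes + 1):
        for subset in combinations(range(num_indexes), size):
            # Pay the fixed cost of every index that is built.
            value = -sum(fixed_cost[j] for j in subset)
            # Each query picks its most useful built index, if any helps.
            for row in gain:
                value += max(0, max(row[j] for j in subset))
            if value > best_value:
                best_value, best_subset = value, frozenset(subset)
    return best_value, best_subset


# Hypothetical instance: 3 candidate indexes, 3 queries.
value, chosen = solve_isp_brute_force(
    fixed_cost=[4, 3, 5],
    gain=[[6, 0, 2], [0, 5, 2], [1, 1, 4]],
)
print(value, sorted(chosen))
```

The running time grows exponentially with the number of candidate indexes, which is why instances of the sizes mentioned in the abstract call for the exact and heuristic algorithms the paper describes rather than enumeration.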
Keywords :
database theory; heuristic programming; indexing; integer programming; linear programming; query processing; relational databases; tree searching; Lagrangian decomposition; NP-hard; approximate algorithms; branch-and-bound algorithm; exact algorithms; heuristic algorithms; index; index selection problem; integer linear program; knapsack subproblem; linear programming relaxation; optimization problem; performance; physical database design; relational database; Algorithm design and analysis; Costs; Design optimization; Heuristic algorithms; Indexes; Integer linear programming; Lagrangian functions; Large-scale systems; Relational databases; Testing
Journal_Title :
Knowledge and Data Engineering, IEEE Transactions on