Title :
A multi-resolution block storage model for database design
Author :
Zhou, Jingren ; Ross, Kenneth A.
Abstract :
We propose a new storage model called MBSM (multiresolution block storage model) for laying out tables on disks. MBSM is intended to speed up operations such as scans that are typical of data warehouse workloads. Disk blocks are grouped into "super-blocks," with a single record stored in a partitioned fashion among the blocks in a super-block. The intention is that a scan operation that needs to consult only a small number of attributes can access just those blocks of each super-block that contain the desired attributes. To achieve good performance given the physical characteristics of modern disks, we organize super-blocks on the disk into fixed-size "mega-blocks." Within a mega-block, blocks of the same type (from various super-blocks) are stored contiguously. We describe the changes needed in a conventional database system to manage tables using such a disk organization. We demonstrate experimentally that MBSM outperforms competing approaches such as NSM (N-ary storage model), DSM (decomposition storage model) and PAX (partition attributes across), for I/O bound decision-support workloads consisting of scans in which not all attributes are required. This improved performance comes at the expense of single-record insert and delete performance; we quantify the trade-offs involved. Unlike DSM, the cost of reconstructing a record from its partitions is small. MBSM stores attributes in a vertically partitioned manner similar to PAX, and thus shares PAX's good CPU cache behavior. We describe methods for mapping attributes to blocks within super-blocks in order to optimize overall performance, and show how to tune the super-block and mega-block sizes.
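The layout described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact addressing scheme, so everything here is an assumption made for illustration: the function names, the parameters `blocks_per_super` and `supers_per_mega`, and the choice to number blocks within a mega-block type by type. The sketch only shows why a scan that needs a few block types can read long contiguous runs.

```python
def block_address(super_block, block_type, blocks_per_super, supers_per_mega):
    """Disk block number of one block, under the assumed layout.

    Hypothetical addressing: each mega-block holds `supers_per_mega`
    super-blocks, and all blocks of the same type are placed side by side.
    """
    mega, slot = divmod(super_block, supers_per_mega)
    mega_start = mega * supers_per_mega * blocks_per_super
    return mega_start + block_type * supers_per_mega + slot


def scan_runs(num_super_blocks, needed_types, blocks_per_super, supers_per_mega):
    """List the (start, length) contiguous runs a scan would read.

    `needed_types` is the set of block types holding the attributes the
    scan needs; adjacent runs are merged into one.
    """
    runs = []
    for first in range(0, num_super_blocks, supers_per_mega):
        # The last mega-block may be only partly filled.
        count = min(supers_per_mega, num_super_blocks - first)
        for block_type in sorted(needed_types):
            start = block_address(first, block_type,
                                  blocks_per_super, supers_per_mega)
            if runs and runs[-1][0] + runs[-1][1] == start:
                runs[-1] = (runs[-1][0], runs[-1][1] + count)
            else:
                runs.append((start, count))
    return runs
```

With 4 blocks per super-block and 8 super-blocks per mega-block (both values chosen arbitrarily), `scan_runs(16, {1}, 4, 8)` yields two runs of 8 blocks each, one per mega-block, rather than 16 scattered single-block reads.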
Keywords :
attribute grammars; data handling; data models; data warehouses; database management systems; disc storage; information storage; optimisation; storage allocation; DSM; I/O bound decision-support workload; MBSM model; N-ary storage model; NSM; PAX CPU cache behavior; attribute mapping; attribute storage; conventional database system; data warehouse workload; database design; decomposition storage model; delete performance; disk block; disk organization; disk physical characteristic; fixed-size megablocks; multiresolution block storage model; partition attributes across; partitioned fashion; performance optimization; scan operation; single-record insert; super-block organization; table lay out; table management; vertical partition; Costs; Data engineering; Data warehouses; Database systems; Delay; Information retrieval; Magnetic heads; Multiresolution analysis; Optimization methods; Volume measurement;
Conference_Title :
Database Engineering and Applications Symposium, 2003. Proceedings. Seventh International
Print_ISBN :
0-7695-1981-4
DOI :
10.1109/IDEAS.2003.1214908