• DocumentCode
    1484834
  • Title
    Efficient Iceberg Query Evaluation Using Compressed Bitmap Index
  • Author
    He, Bin ; Hsiao, Hui-I ; Liu, Ziyang ; Huang, Yu ; Chen, Yi
  • Author_Institution
    IBM Almaden Research Center, San Jose
  • Volume
    24
  • Issue
    9
  • fYear
    2012
  • Firstpage
    1570
  • Lastpage
    1583
  • Abstract
    Decision support and knowledge discovery systems often compute aggregate values of interesting attributes by processing huge amounts of data in very large databases and/or warehouses. In particular, an iceberg query is a special type of aggregation query that computes aggregate values above a user-provided threshold. Usually, only a small number of results satisfy the threshold constraint, yet those results often carry important and valuable business insights. Because of the small result set, iceberg queries offer many opportunities for deep query optimization. However, most existing iceberg query processing algorithms do not take advantage of the small-result-set property and rely heavily on a tuple-scan-based approach. This incurs intensive disk access and computation, resulting in long processing times, especially when the data size is large. The bitmap index, which builds one bitmap vector for each attribute value, has been gaining popularity in both column-oriented and row-oriented databases in recent years. It occupies less space than the raw data and offers opportunities for more efficient query processing. In this paper, we exploit the properties of the bitmap index and develop a very effective bitmap pruning strategy for processing iceberg queries. Our index-pruning-based approach eliminates the need to scan and process the entire data set (table) and thus speeds up iceberg query processing significantly. Experiments show that our approach is much more efficient than existing algorithms commonly used in row-oriented and column-oriented databases.
  • Keywords
    Aggregates; Heuristic algorithms; Indexes; Query processing; Iceberg query; bitmap index; column-oriented database
  • fLanguage
    English
  • Journal_Title
    Knowledge and Data Engineering, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1041-4347
  • Type
    jour
  • DOI
    10.1109/TKDE.2011.73
  • Filename
    5740885
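
The abstract describes answering an iceberg query from bitmap vectors instead of scanning tuples. The sketch below illustrates that general idea for a COUNT query grouped by two attributes. It is not the paper's algorithm: the abstract does not spell out the pruning strategy, so the rule used here (drop any bitmap vector with fewer set bits than the threshold, since no group built from it can reach the threshold) is an assumption, as are the function names, the use of plain Python integers as uncompressed bitmaps, and the reading of "above a threshold" as "count >= threshold".

```python
from collections import defaultdict
from itertools import product


def build_bitmap_index(column):
    """Map each distinct value to an int whose bit i is set when row i holds it."""
    index = defaultdict(int)
    for row, value in enumerate(column):
        index[value] |= 1 << row
    return dict(index)


def popcount(bits):
    """Number of set bits in a non-negative integer bitmap."""
    return bin(bits).count("1")


def iceberg_count(col_a, col_b, threshold):
    """Return {(a, b): count} for every group whose count is >= threshold.

    Roughly: SELECT A, B, COUNT(*) ... GROUP BY A, B HAVING COUNT(*) >= threshold
    """
    idx_a = build_bitmap_index(col_a)
    idx_b = build_bitmap_index(col_b)

    # Assumed pruning rule: a value that occurs fewer than `threshold` times
    # cannot belong to any qualifying group, so its vector is discarded.
    keep_a = {v: bm for v, bm in idx_a.items() if popcount(bm) >= threshold}
    keep_b = {v: bm for v, bm in idx_b.items() if popcount(bm) >= threshold}

    result = {}
    for (a, bm_a), (b, bm_b) in product(keep_a.items(), keep_b.items()):
        # Bitwise AND marks the rows where both values occur together.
        count = popcount(bm_a & bm_b)
        if count >= threshold:
            result[(a, b)] = count
    return result


if __name__ == "__main__":
    col_a = ["x", "x", "y", "x", "z", "y"]
    col_b = ["p", "p", "q", "p", "p", "q"]
    print(iceberg_count(col_a, col_b, 2))
```

With this toy data the value "z" occurs only once, so its vector is pruned before any AND is performed. A real system would operate on compressed bitmaps, as the paper's title indicates, rather than on Python integers.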