Author_Institution :
Dept. of Biomed. Eng., Istanbul Medipol Univ., Istanbul, Turkey
Abstract :
Given a vector X = (x1, x2, ..., xn) of integers, a range selection query (i, j, k) asks for the k-th smallest integer in (xi, xi+1, ..., xj), for any (i, j, k) such that 1 ≤ i ≤ j ≤ n and 1 ≤ k ≤ j - i + 1. Previous studies on the problem kept X intact and proposed data structures that occupy O(n · log n) bits of space in addition to X itself and answer queries in logarithmic time. In this study, we replace X and encode all of its integers in a single wavelet tree using S = n · log u + Σ∀i log xi + o(n · log u + Σ∀i log xi) bits, where u is the number of distinct ⌊log xi⌋ values observed in X. Note that u is at most 32 (64) for 32-bit (64-bit) integers and that, when xi > u, the space used for xi in the proposed data structure is less than the length of the Elias-δ code of xi. Besides this data-aware coding of X, range selection is performed in O(log u + log x´) time, where x´ is the k-th smallest integer in the queried range. This adaptive bound is independent of the size of X and depends only on the actual answer to the query. In summary, to the best of our knowledge, we present the first algorithm that uses data-aware space and time for the general range selection problem.
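To make the query concrete, the following is a minimal sketch of range selection on a textbook, pointer-based wavelet tree. It is not the data-aware encoding proposed in the paper: it splits on the value range rather than on ⌊log xi⌋, stores plain prefix counts instead of succinct bitmaps, and answers a query in time logarithmic in the value range. The names WaveletTree and range_select are hypothetical, chosen only for illustration.

```python
class WaveletTree:
    """Textbook wavelet tree over integers (illustrative, not space-efficient)."""

    def __init__(self, values, lo=None, hi=None):
        if lo is None:
            # Root call: assumes a non-empty input sequence.
            lo, hi = min(values), max(values)
        self.lo, self.hi = lo, hi
        self.left = self.right = None
        if lo == hi or not values:
            return  # leaf, or an empty node that no query ever reaches
        mid = (lo + hi) // 2
        # prefix[t] = how many of the first t values go to the left child
        self.prefix = [0]
        for v in values:
            self.prefix.append(self.prefix[-1] + (v <= mid))
        self.left = WaveletTree([v for v in values if v <= mid], lo, mid)
        self.right = WaveletTree([v for v in values if v > mid], mid + 1, hi)

    def range_select(self, i, j, k):
        """Return the k-th smallest of x_i..x_j (1-based, inclusive)."""
        node = self
        i -= 1  # switch to the half-open 0-based interval [i, j)
        while node.lo != node.hi:
            li, lj = node.prefix[i], node.prefix[j]
            in_left = lj - li  # elements of the range routed to the left
            if k <= in_left:
                node, i, j = node.left, li, lj
            else:
                k -= in_left
                node, i, j = node.right, i - li, j - lj
        return node.lo


X = [5, 1, 9, 3, 7, 3, 8]
wt = WaveletTree(X)
print(wt.range_select(2, 6, 3))  # 3rd smallest of (1, 9, 3, 7, 3) is 3
```

Each step of the loop maps the query interval into one child and halves the candidate value range, which is why the answer is found without sorting the queried subvector.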
Keywords :
computational complexity; data structures; query processing; trees (mathematics); wavelet transforms; data aware space; data aware time; data structure; data-aware coding; general range selection problem; logarithmic time; query answering; range selection queries; wavelet tree; Classification algorithms; Complexity theory; Data compression; Data structures; Encoding; Sorting; Vegetation; compact integer coding; range selection; wavelet tree;