Title :
Hybrid Compression of Bitvectors for the FM-Index
Author :
Karkkainen, J. ; Kempa, Dominik ; Puglisi, Simon J.
Author_Institution :
Dept. of Comput. Sci., Univ. of Helsinki, Helsinki, Finland
Abstract :
Compressed bit vectors supporting rank and select operations are the workhorse of compressed data structures. We propose a hybrid scheme for implementing compressed bit vectors, which divides the bit vector into blocks and then chooses the encoding of each block separately from a number of different encoding methods. Hybrid encoding is particularly suitable for bit vectors that have lots of local and regional variation, such as those present in the FM-index, a popular compressed data structure for pattern matching. We propose a specific hybrid combination of three simple encoding methods for FM-index bit vectors achieving superior space-time tradeoffs in experiments.
Keywords :
data compression; data structures; pattern matching; FM-index; bitvectors hybrid compression; compressed bit vectors; compressed data structure; encoding methods; hybrid encoding; pattern matching; Arrays; Bioinformatics; Encoding; Entropy; Indexes; Pattern matching; Compressed bitvectors; FM-index; Succinct data structures;
Conference_Titel :
Data Compression Conference (DCC), 2014
Conference_Location :
Snowbird, UT
DOI :
10.1109/DCC.2014.87