Title :
Search and Modification in Compressed Texts
Author :
Böttcher, Stefan ; Bültmann, Alexander ; Hartel, Rita
Author_Institution :
EIM - Electr. Eng., Comput. Sci. & Math., Univ. of Paderborn, Paderborn, Germany
Abstract :
Text compression techniques such as bzip2 do not support searching for or updating substrings at given positions of a compressed text without first decompressing it. We have developed the Indexed Reversible Transformation (IRT), a modified version of the Burrows-Wheeler Transformation (BWT) that, in combination with run-length encoding (RLE) and wavelet trees (WT), allows position-based searching and updating of substrings in compressed texts without prior decompression. As a result, IRT may be useful for a large class of applications that, due to space limitations, prefer to search or modify compressed texts rather than uncompressed ones.
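For orientation, the two standard building blocks named in the abstract can be sketched as follows. This is a minimal illustration of the textbook BWT followed by RLE, not the authors' IRT: the index structures that make position-based search and update possible, and the wavelet tree layer, are not shown. The function names and the sentinel character are assumptions made for this sketch only.

```python
from itertools import groupby

SENTINEL = "\0"  # assumed end marker; must not occur in the input text


def bwt(text):
    """Textbook BWT: last column of the sorted rotations of text + sentinel."""
    s = text + SENTINEL
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rot[-1] for rot in rotations)


def inverse_bwt(last):
    """Invert the BWT by walking the text forward through the sorted rows."""
    # A stable sort of positions by character pairs each row's first
    # character with the same occurrence of it in the last column.
    order = sorted(range(len(last)), key=lambda i: last[i])
    idx = order[0]  # row 0 starts with the sentinel, the smallest character
    out = []
    for _ in range(len(last) - 1):
        idx = order[idx]
        out.append(last[idx])
    return "".join(out)


def rle_encode(s):
    """Run-length encode s as a list of (character, run length) pairs."""
    return [(ch, len(list(run))) for ch, run in groupby(s)]


def rle_decode(pairs):
    """Expand (character, run length) pairs back into a string."""
    return "".join(ch * n for ch, n in pairs)


if __name__ == "__main__":
    transformed = bwt("banana")
    packed = rle_encode(transformed)
    assert inverse_bwt(rle_decode(packed)) == "banana"
```

The naive rotation sort is quadratic in space and is used here only for readability. The sketch also shows the limitation the abstract starts from: recovering or changing a substring requires inverting the whole transform.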
Keywords :
data compression; encoding; tree searching; wavelet transforms; burrows wheeler transformation; indexed reversible transformation; position based searching; run length encoding; text compression; wavelet trees; Arrays; Compressors; Distance measurement; Encoding; Indexes; Merging; Sorting; BWT; block sorting; delete; insert; modification in compressed texts; search;
Conference_Title :
Data Compression Conference (DCC), 2011
Conference_Location :
Snowbird, UT
Print_ISBN :
978-1-61284-279-0
DOI :
10.1109/DCC.2011.47