DocumentCode :
2188026
Title :
LZ77-Like Compression with Fast Random Access
Author :
Kreft, Sebastian ; Navarro, Gonzalo
Author_Institution :
Dept. of Comput. Sci., Univ. of Chile, Santiago, Chile
fYear :
2010
fDate :
24-26 March 2010
Firstpage :
239
Lastpage :
248
Abstract :
We introduce an alternative Lempel-Ziv text parsing, LZ-End, that converges to the entropy and in practice gets very close to LZ77. LZ-End forces sources to finish at the end of a previous phrase. Most Lempel-Ziv parsings can decompress the text only from the beginning. LZ-End is the only parsing we know of that is capable of decompressing arbitrary phrases in optimal time, while staying closely competitive with LZ77, especially on highly repetitive collections, where LZ77 excels. Thus LZ-End is ideal as a compression format for highly repetitive sequence databases, where access to individual sequences is required, and it also opens the door to compressed indexing schemes for such collections.
Keywords :
data compression; database indexing; grammars; text analysis; LZ77-like compression; Lempel-Ziv text parsing; entropy; fast random access; highly repetitive sequence databases; indexing scheme compression; text decompression; Bioinformatics; Computer science; DNA; Data compression; Data mining; Data structures; Databases; Entropy; Indexing; Sequences;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Data Compression Conference (DCC), 2010
Conference_Location :
Snowbird, UT
ISSN :
1068-0314
Print_ISBN :
978-1-4244-6425-8
Electronic_ISBN :
1068-0314
Type :
conf
DOI :
10.1109/DCC.2010.29
Filename :
5453442