DocumentCode :
2707971
Title :
Real-time traversal in grammar-based compressed files
Author :
Gasieniec, Leszek ; Kolpakov, Roman ; Potapov, Igor ; Sant, Paul
Author_Institution :
Dept. of Comput. Sci., Liverpool Univ., UK
fYear :
2005
fDate :
29-31 March 2005
Firstpage :
458
Abstract :
Summary form only given. In text compression applications, it is important to be able to process compressed data without complete decompression. It is therefore crucial to study compression methods that allow time- and space-efficient access to any fragment of a compressed file. We study the real-time recovery of consecutive symbols from compressed files in the context of grammar-based compression. In this setting, a compressed text is represented as a small (a few KB) dictionary D containing a set of code words, together with a very long (a few MB) string of symbols drawn from D. The space efficiency of this kind of compression is comparable with that of standard compression methods based on the Lempel-Ziv approach. We show that one can visit consecutive symbols of the original text, moving from one symbol to the next in constant time and using O(|D|) extra space. This algorithm improves on the on-line linear (amortised) time algorithm presented in (L. Gasieniec et al., Proc. 13th Int. Symp. on Fund. of Comp. Theo., LNCS, vol. 2138, pp. 138-152, 2001).
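To illustrate the setting described in the abstract, here is a minimal Python sketch of left-to-right traversal over a grammar-compressed text. It is not the real-time algorithm of the paper: it is a naive stack-based expansion whose cost per symbol is only amortised constant (the worst case for a single step grows with the depth of the grammar). The representation is an assumption made for illustration: the dictionary maps integer code words to pairs of symbols, terminals are one-character strings, and the names `traverse`, `dictionary` and `sequence` are hypothetical.

```python
from typing import Dict, Iterator, List, Tuple, Union

# Assumed representation: a terminal is a one-character str,
# a code word is an int key into the dictionary D.
Symbol = Union[str, int]


def traverse(dictionary: Dict[int, Tuple[Symbol, Symbol]],
             sequence: List[Symbol]) -> Iterator[str]:
    """Lazily yield the symbols of the original text, left to right.

    Each code word is expanded on demand with an explicit stack, so the
    text is never fully decompressed. For an acyclic dictionary the stack
    never holds more than |D| + 1 entries.
    """
    for top in sequence:
        stack: List[Symbol] = [top]
        while stack:
            sym = stack.pop()
            if isinstance(sym, str):
                # Terminal symbol: emit it as the next character of the text.
                yield sym
            else:
                # Code word: push its right part first so that the left
                # part is expanded (and emitted) first.
                left, right = dictionary[sym]
                stack.append(right)
                stack.append(left)


# Toy dictionary D: 0 -> "ab", 1 -> "abc", 2 -> "abcab"
D = {0: ("a", "b"), 1: (0, "c"), 2: (1, 0)}
compressed = [2, 1, "a"]

print("".join(traverse(D, compressed)))  # abcababca
```

Because every rule here has exactly two symbols, the number of code-word expansions is smaller than the number of emitted characters, which is why the total work is linear in the text length. Bounding the delay between any two consecutive symbols by a constant, as the abstract claims, requires additional precomputed structure over D that this sketch does not attempt.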
Keywords :
data compression; data structures; dictionaries; grammars; text analysis; code word set; compressed text dictionary representation; compression space efficiency; decompression; dictionary symbols string; grammar-based compressed files; real-time compressed file traversal; real-time consecutive symbol recovery; text compression; time/space fragment access; Computation theory; Computer science; Data compression; Dictionaries; Pattern matching;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Data Compression Conference, 2005. Proceedings. DCC 2005
ISSN :
1068-0314
Print_ISBN :
0-7695-2309-9
Type :
conf
DOI :
10.1109/DCC.2005.78
Filename :
1402215