DocumentCode :
818456
Title :
Error Resilient LZ'77 Data Compression: Algorithms, Analysis, and Experiments
Author :
Lonardi, Stefano ; Szpankowski, Wojciech ; Ward, Mark Daniel
Author_Institution :
Dept. of Comput. Sci. & Eng., California Univ., Riverside, CA
Volume :
53
Issue :
5
fYear :
2007
fDate :
5/1/2007
Firstpage :
1799
Lastpage :
1813
Abstract :
We propose a joint source-channel coding algorithm capable of correcting some errors in the popular Lempel-Ziv'77 (LZ'77) scheme without introducing any measurable degradation in the compression performance. This can be achieved because the LZ'77 encoder does not completely eliminate the redundancy present in the input sequence. One source of redundancy can be observed when an LZ'77 phrase has multiple matches. In this case, LZ'77 can issue a pointer to any of those matches, and a particular choice carries some additional bits of information. We call a scheme with embedded redundant information the LZS'77 algorithm. We analyze the number of longest matches in such a scheme and prove that it follows the logarithmic series distribution with mean 1/h (plus some fluctuations), where h is the source entropy. Thus, the distribution associated with the number of redundant bits is well concentrated around its mean, a highly desirable property for error correction. These analytic results are proved by a combination of combinatorial, probabilistic, and analytic methods (e.g., Mellin transform, depoissonization, combinatorics on words). In fact, we analyze LZS'77 by studying the multiplicity matching parameter in a suffix tree, which in turn is analyzed via comparison to its independent version, called a trie. Finally, we present an algorithm in which a channel coder (e.g., a Reed-Solomon (RS) coder) succinctly uses the inherent additional redundancy left by the LZS'77 encoder to detect and correct a limited number of errors. We call such a scheme the LZRS'77 algorithm. LZRS'77 is perfectly backward-compatible with LZ'77, that is, a file compressed with our error-resistant LZRS'77 can still be decompressed by a generic LZ'77 decoder.
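The multiple-match idea in the abstract can be illustrated with a small sketch. This is not the authors' LZS'77 or LZRS'77 implementation: the token format, the unbounded search window, and the function names (longest_matches, encode, decode, extract) are assumptions made only for illustration, and no Reed-Solomon coder is included. When a phrase has q longest matches, the encoder picks one of them according to floor(log2(q)) message bits; an ordinary decoder ignores the choice and still recovers the text.

```python
from math import floor, log2

def longest_matches(text, i):
    """Longest match length for text[i:] against earlier starts j < i
    (overlap allowed), plus every start that achieves that length."""
    best_len, starts = 0, []
    for j in range(i):
        k = 0
        while i + k < len(text) and text[j + k] == text[i + k]:
            k += 1
        if k > best_len:
            best_len, starts = k, [j]
        elif k == best_len and k > 0:
            starts.append(j)
    return best_len, starts

def encode(text, bits):
    """Greedy parse; each pointer's choice among q matches hides floor(log2 q) bits."""
    tokens, i, b = [], 0, 0
    while i < len(text):
        length, starts = longest_matches(text, i)
        if length == 0:
            tokens.append(("lit", text[i]))
            i += 1
            continue
        nbits = floor(log2(len(starts)))
        chunk = bits[b:b + nbits].ljust(nbits, "0")  # pad with zeros if bits run out
        idx = int(chunk, 2) if nbits else 0
        b += nbits
        tokens.append(("ptr", starts[idx], length))
        i += length
    return tokens, min(b, len(bits))  # tokens and number of message bits embedded

def decode(tokens):
    """Plain decoder: it never needs to know that bits were embedded."""
    out = []
    for tok in tokens:
        if tok[0] == "lit":
            out.append(tok[1])
        else:
            _, start, length = tok
            for k in range(length):  # char-by-char copy handles overlapping matches
                out.append(out[start + k])
    return "".join(out)

def extract(tokens):
    """Recover the embedded bits by recomputing the candidate matches."""
    text, bits, i = decode(tokens), [], 0
    for tok in tokens:
        if tok[0] == "lit":
            i += 1
            continue
        _, start, length = tok
        _, starts = longest_matches(text, i)
        nbits = floor(log2(len(starts)))
        if nbits:
            bits.append(format(starts.index(start), f"0{nbits}b"))
        i += length
    return "".join(bits)

text = "abcXabcYabcZabcWabc"
tokens, used = encode(text, "1011")
print(decode(tokens) == text, used, extract(tokens))  # True 4 1011
```

In this toy input the later occurrences of "abc" have 2, 3, and 4 earlier longest matches, so they carry 1, 1, and 2 bits respectively. In the scheme the abstract describes, such bits would hold channel-code parity rather than an arbitrary message.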
Keywords :
combined source-channel coding; data compression; decoding; entropy codes; error correction codes; probability; Lempel-Ziv'77 scheme; decoder; embedded redundant information; error correction; error resilient LZ'77 data compression; joint source-channel coding algorithm; logarithmic series distribution; probabilistic method; source entropy; Algorithm design and analysis; Combinatorial mathematics; Data analysis; Data compression; Degradation; Entropy; Error correction; Fluctuations; Redundancy; Reed-Solomon codes; Autocorrelation polynomial; Lempel–Ziv'77 (LZ'77) scheme; Mellin transform; Reed–Solomon (RS) code; combinatorics on words; depoissonization; joint source–channel coding; multiple matches; pattern matching; suffix trees; tries
fLanguage :
English
Journal_Title :
Information Theory, IEEE Transactions on
Publisher :
IEEE
ISSN :
0018-9448
Type :
jour
DOI :
10.1109/TIT.2007.894689
Filename :
4167743