Title :
Universal Noiseless Compression for Noisy Data
Author :
Shamir, Gil I. ; Tjalkens, Tjalling J. ; Willems, Frans M. J.
Author_Institution :
Univ. of Utah, Salt Lake City
Date :
Jan. 29 2007-Feb. 2 2007
Abstract :
We study universal compression of discrete data sequences that have been corrupted by noise. We show that while, as expected, there are many cases in which the entropy of these sequences increases from that of the original data, somewhat surprisingly and counter-intuitively, the universal coding redundancy of such sequences cannot increase compared to that of the original data. We derive conditions that guarantee that this redundancy does not decrease asymptotically (in first order) from the original sequence redundancy in the stationary memoryless case. We then provide bounds on the redundancy for coding finite-length (large) noisy blocks generated by stationary memoryless sources and corrupted by some specific memoryless channels. Finally, we propose a sequential probability estimation method that can be used to compress binary data corrupted by a noisy channel. While this method is of much benefit in compressing short blocks of noise-corrupted data, it is more general and allows sequential compression of binary sequences for which the probability of a bit is known to be limited to any given interval (not necessarily the full interval between 0 and 1). Additionally, the method has many other applications, including prediction and sequential channel estimation.
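The two main ideas in the abstract can be illustrated with a small sketch. It is not the authors' estimator, which the abstract does not specify. It assumes a binary symmetric channel with a hypothetical crossover probability `delta`, and it uses a Krichevsky-Trofimov (KT) estimate clipped to a known interval `[lo, hi]` as a stand-in for an interval-constrained sequential estimator. All function names are illustrative.

```python
import math

def binary_entropy(p):
    """Entropy in bits of a Bernoulli(p) source."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def bsc_output_prob(theta, delta):
    """P(output bit = 1) when a Bernoulli(theta) bit passes through a
    binary symmetric channel with crossover probability delta (assumed model)."""
    return theta * (1 - delta) + (1 - theta) * delta

def clipped_kt_code_length(bits, lo, hi):
    """Ideal sequential code length in bits when each bit is coded with a
    KT estimate of P(1) clipped to [lo, hi], with 0 <= lo < hi <= 1.
    Illustrative stand-in only, not the method proposed in the paper."""
    ones = zeros = 0
    total = 0.0
    for b in bits:
        p1 = (ones + 0.5) / (ones + zeros + 1.0)  # KT estimate of P(1)
        p1 = min(max(p1, lo), hi)                 # enforce the known interval
        total += -math.log2(p1 if b == 1 else 1.0 - p1)
        if b == 1:
            ones += 1
        else:
            zeros += 1
    return total

# Noise raises the entropy: a source with P(1) = 0.1 seen through a channel
# with delta = 0.2 has output P(1) = 0.26, which is closer to 1/2.
noisy_p = bsc_output_prob(0.1, 0.2)
print(binary_entropy(0.1), binary_entropy(noisy_p))

# For delta < 1/2 the output P(1) must lie in [delta, 1 - delta], so the
# estimator can be restricted to that interval.
print(clipped_kt_code_length([1, 0, 1, 1, 0], 0.2, 0.8))
```

Under these assumptions, clipping keeps the estimate away from values that the channel rules out, which is one plausible reason interval knowledge could help on short blocks; the actual gains depend on the estimator used in the paper.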
Keywords :
binary sequences; data compression; entropy; memoryless systems; probability; redundancy; signal denoising; binary data; discrete data sequences; finite length noisy blocks; memoryless channels; noisy channel; noisy data; sequential compression; sequential probability estimation; stationary memoryless sources; universal coding redundancy; universal noiseless compression; Channel estimation; Gas insulated transmission lines; Noise generators; Noise reduction; Statistics
Conference_Title :
Information Theory and Applications Workshop, 2007
Conference_Location :
La Jolla, CA
Print_ISBN :
978-0-615-15314-8
DOI :
10.1109/ITA.2007.4357603