Title :
On the hardness of finding optimal multiple preset dictionaries
Author :
Mitzenmacher, Michael
Author_Institution :
Div. of Eng. & Appl. Sci., Harvard Univ., Cambridge, MA, USA
fDate :
7/1/2004
Abstract :
We show that the following simple compression problem is NP-hard: given a collection of documents, find the pair of Huffman dictionaries that minimizes the total compressed size of the collection, where the best dictionary from the pair is used to compress each document. We also show the NP-hardness of finding optimal multiple preset dictionaries for LZ'77-based compression schemes. Our reductions make use of the catalog segmentation problem, a natural partitioning problem. Our results justify heuristic attacks used in practice.
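The objective in the two-dictionary Huffman problem can be made concrete with a small sketch. The code below is an illustration, not the paper's construction. It assumes that documents are plain strings over a character alphabet, that the cost of a document is its encoded length in bits, and that the cost of storing the dictionaries themselves is ignored. The function names (`huffman_code_lengths`, `group_cost`, `best_pair_cost`) are hypothetical. It tries every split of the collection into two groups and builds one Huffman code per group, so its running time is exponential in the number of documents, which is consistent with the hardness result stated above.

```python
import heapq
from collections import Counter
from itertools import product


def huffman_code_lengths(freqs):
    """Return {symbol: code length in bits} for a Huffman code over freqs."""
    if not freqs:
        return {}
    if len(freqs) == 1:
        # Assumption: a lone symbol still costs one bit per occurrence.
        return {sym: 1 for sym in freqs}
    # Heap entries are (weight, unique tie-breaker, symbols in the subtree).
    heap = [(w, i, [s]) for i, (s, w) in enumerate(freqs.items())]
    heapq.heapify(heap)
    lengths = dict.fromkeys(freqs, 0)
    next_id = len(heap)
    while len(heap) > 1:
        w1, _, syms1 = heapq.heappop(heap)
        w2, _, syms2 = heapq.heappop(heap)
        # Merging two subtrees pushes every symbol in them one level deeper.
        for s in syms1 + syms2:
            lengths[s] += 1
        heapq.heappush(heap, (w1 + w2, next_id, syms1 + syms2))
        next_id += 1
    return lengths


def group_cost(docs):
    """Bits needed to encode all docs with one Huffman code built for the group."""
    freqs = Counter()
    for doc in docs:
        freqs.update(doc)
    lengths = huffman_code_lengths(freqs)
    return sum(freqs[s] * lengths[s] for s in freqs)


def best_pair_cost(docs):
    """Brute-force the best two-way split; returns (total bits, group labels)."""
    best = None
    for labels in product((0, 1), repeat=len(docs)):
        groups = ([], [])
        for doc, label in zip(docs, labels):
            groups[label].append(doc)
        cost = group_cost(groups[0]) + group_cost(groups[1])
        if best is None or cost < best[0]:
            best = (cost, labels)
    return best


if __name__ == "__main__":
    docs = ["aaaab", "aaab", "cccd", "ccccd"]
    print("one dictionary :", group_cost(docs), "bits")
    print("two dictionaries:", best_pair_cost(docs))
```

Searching over partitions is enough under these assumptions: once each document is assigned to whichever dictionary serves it better, replacing each dictionary with the Huffman code for its own group cannot increase the total.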
Keywords :
Huffman codes; data compression; dictionaries; optimisation; Huffman coding; LZ'77-based compression schemes; NP-hard; catalog segmentation problem; natural partitioning problem; optimal multiple preset dictionaries; two-stage compression; Code standards; Computational efficiency; Concurrent computing; Costs; Decoding; Dictionaries; Encoding; Huffman coding; Testing; Transform coding
Journal_Title :
Information Theory, IEEE Transactions on
DOI :
10.1109/TIT.2004.830778