Author :
Milidiú, Ruy Luiz ; Pessoa, Artur Alves ; Laber, Eduardo Sany
Abstract :
The minimum-redundancy prefix code problem is to determine, for a given list $W=[\omega_1,\ldots,\omega_n]$ of $n$ positive symbol weights, a list $L=[l_1,\ldots,l_n]$ of $n$ corresponding integer codeword lengths such that $\sum_{i=1}^{n} 2^{-l_i} \le 1$ and $\sum_{i=1}^{n} \omega_i l_i$ is minimized. Let us consider the case where $W$ is already sorted. In this case, the output list $L$ can be represented by a list $M=[m_1,\ldots,m_H]$, where $m_l$, for $l=1,\ldots,H$, denotes the multiplicity of the codeword length $l$ in $L$ and $H$ is the length of the longest codeword. Fortunately, $H$ is proved to be $O(\min(\log(1/p_1), n))$, where $p_1$ is the smallest symbol probability, given by $\omega_1/\sum_{i=1}^{n} \omega_i$. We present the Fast LazyHuff (F-LazyHuff), the Economical LazyHuff (E-LazyHuff), and the Best LazyHuff (B-LazyHuff) algorithms. F-LazyHuff runs in $O(n)$ time but requires $O(\min(H^2, n))$ additional space. On the other hand, E-LazyHuff runs in $O(n + n\log(n/H))$ time, requiring only $O(H)$ additional space. Finally, B-LazyHuff asymptotically improves on both previous algorithms, requiring only $O(n)$ time and $O(H)$ additional space. Moreover, our three algorithms have the advantage of not writing over the input buffer during code calculation, a feature that is very useful in some applications.
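To make the problem statement concrete, the following is a minimal Python sketch of the quantities the abstract defines: the codeword lengths L, the multiplicity list M, and the Kraft sum. It uses the classical heap-based Huffman construction, not any of the LazyHuff algorithms, and makes no attempt to match their time or space bounds. The function names `huffman_lengths`, `multiplicities`, and `kraft_sum` are hypothetical and chosen only for illustration.

```python
import heapq
from fractions import Fraction


def huffman_lengths(weights):
    """Optimal codeword lengths via the classical Huffman construction.

    Illustration only: this is NOT LazyHuff and is not space- or time-efficient.
    """
    n = len(weights)
    if n == 1:
        return [1]  # a single symbol still needs a one-bit codeword
    # Heap entries: (subtree weight, unique tie-breaker, symbol indices in subtree).
    heap = [(w, i, [i]) for i, w in enumerate(weights)]
    heapq.heapify(heap)
    lengths = [0] * n
    next_id = n
    while len(heap) > 1:
        w1, _, s1 = heapq.heappop(heap)
        w2, _, s2 = heapq.heappop(heap)
        merged = s1 + s2
        for i in merged:
            lengths[i] += 1  # every symbol under the new node moves one level deeper
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1
    return lengths


def multiplicities(lengths):
    """Return M = [m_1, ..., m_H], where m_l counts the codewords of length l."""
    H = max(lengths)
    M = [0] * H
    for l in lengths:
        M[l - 1] += 1
    return M


def kraft_sum(lengths):
    """Exact value of the sum of 2^(-l_i); a prefix code requires it to be <= 1."""
    return sum(Fraction(1, 2 ** l) for l in lengths)


if __name__ == "__main__":
    W = [1, 1, 2, 3, 5]  # example weights, sorted in nondecreasing order
    L = huffman_lengths(W)
    print("L =", L)
    print("M =", multiplicities(L))
    print("Kraft sum =", kraft_sum(L))
    print("cost =", sum(w * l for w, l in zip(W, L)))
```

The list M has only H entries regardless of n, which is why representing the output this way allows the additional space to be stated in terms of H.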
Keywords :
Huffman codes; computational complexity; redundancy; runlength codes; Best LazyHuff algorithm; Economical LazyHuff algorithm; Fast LazyHuff algorithm; Lazy traversal algorithm; data compression; homogenization technique; input buffer; integer codeword length; minimum redundancy coding; minimum-redundancy prefix codes; positive symbol weights; runlength Huffman algorithm; smallest symbol probability; space-economical algorithms; Binary trees; Data compression; Informatics; Source coding; Writing