Title :
Improved behaviour of tries by the "symmetrization" of the source
Author :
Reznik, Yuriy A. ; Szpankowski, Wojciech
Author_Institution :
RealNetworks Inc., Seattle, WA, USA
Abstract :
In this paper, we propose and study a pre-processing technique for improving the performance of digital tree (trie)-based search algorithms under asymmetric memoryless sources. This technique, which we call symmetrization of the source, bijectively maps sequences of symbols from the original (asymmetric) source into symbols of an output alphabet, resulting in a more uniform distribution. We introduce a criterion of efficiency for such a mapping and demonstrate that the problem of finding an optimal symmetrization transform, either for a given source or universal, is equivalent to the problem of constructing a minimum-redundancy variable-length-to-block code for this source (or class of sources). Based on this result, we propose search algorithms that incorporate known variable-length-to-block codes (both optimal for a given source and universal) and study their asymptotic behaviour. We complement our analysis with a description of an efficient algorithm for universal symmetrization of binary memoryless sources, and compare the performance of the resulting search structure with that of standard tries.
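As an illustration of the kind of mapping described in the abstract, the sketch below builds a variable-length-to-block parsing tree for a binary memoryless source using Tunstall's construction, one well-known code of this type. Whether this is the exact construction used in the paper is an assumption; the function names `tunstall_parse_tree` and `symmetrize` and the parameters `p_zero` and `num_leaves` are hypothetical and chosen only for this example.

```python
import heapq
from itertools import count


def tunstall_parse_tree(p_zero, num_leaves):
    """Return {binary phrase: probability} for a Tunstall parsing tree.

    p_zero is the probability of symbol '0' in a binary memoryless source;
    num_leaves is the size of the output alphabet. Both are illustrative
    parameters, not values taken from the paper.
    """
    if not 0.0 < p_zero < 1.0:
        raise ValueError("p_zero must be strictly between 0 and 1")
    if num_leaves < 2:
        raise ValueError("need at least two leaves")
    probs = {"0": p_zero, "1": 1.0 - p_zero}
    tie = count()  # tie-breaker so the heap never compares phrases
    # Max-heap via negated probabilities, seeded with the one-symbol phrases.
    heap = [(-p, next(tie), s) for s, p in probs.items()]
    heapq.heapify(heap)
    while len(heap) < num_leaves:
        # Split the most probable phrase into its two one-symbol extensions.
        neg_p, _, phrase = heapq.heappop(heap)
        for s, p in probs.items():
            heapq.heappush(heap, (neg_p * p, next(tie), phrase + s))
    return {phrase: -neg_p for neg_p, _, phrase in heap}


def symmetrize(bits, phrases):
    """Parse a bit string into phrases and emit one output symbol per phrase.

    Returns the list of output symbols and any unparsed trailing bits.
    """
    index = {ph: i for i, ph in enumerate(sorted(phrases))}
    out, cur = [], ""
    for b in bits:
        cur += b
        if cur in index:  # phrases are prefix-free, so the first match is the parse
            out.append(index[cur])
            cur = ""
    return out, cur


tree = tunstall_parse_tree(p_zero=0.9, num_leaves=4)
symbols, leftover = symmetrize("0000011", tree)
```

With `p_zero=0.9` and four leaves, the phrases are `000`, `001`, `01` and `1`, and the largest phrase probability drops from 0.9 for the raw symbol to about 0.729, which is the sense in which the output alphabet is closer to uniform. The output symbols could then be used as trie keys in place of the raw bits.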
Keywords :
binary sequences; block codes; memoryless systems; minimisation; redundancy; tree codes; tree data structures; tree searching; variable length codes; asymmetric memoryless sources; asymptotic behaviour; bijective mapping; binary memoryless sources; digital tree search algorithms; minimum redundancy code; optimisation; output alphabet; performance; pre-processing technique; sequences; tries; uniform distribution; universal symmetrization; variable-length-to-block code; Data compression
Conference_Title :
Data Compression Conference, 2002. Proceedings. DCC 2002
Print_ISBN :
0-7695-1477-4
DOI :
10.1109/DCC.2002.999975