• DocumentCode
    2300420
  • Title
    Enumerative coding for tree sources
  • Author
    Martín, Álvaro; Seroussi, Gadiel; Weinberger, Marcelo
  • Author_Institution
Univ. de la República, Montevideo
  • fYear
    2008
  • fDate
    5-9 May 2008
  • Firstpage
    266
  • Lastpage
    267
  • Abstract
    Given a parametric model family, the set of sequences of length n over a finite alphabet A is partitioned into type classes, where two sequences belong to the same class if and only if they are assigned the same probability by all models in the family. Since all sequences in a class are equiprobable, the universal probability assignment problem for the given model family reduces to optimally assigning probabilities to type classes. This reduction is optimally performed by the Normalized Maximum Likelihood (NML) code, which can be interpreted as a description of the type, generated by assigning to it a probability proportional to its ML probability, followed by an enumeration of the sequences in the type class. Unfortunately, implementing the NML code is difficult even for the simplest model classes. Other universal methods, based, for example, on the Krichevskii-Trofimov sequential probability assignment, are computationally efficient and also assign the same code length to all the sequences of a given type. They do not, however, provide a separate and identifiable description of the type. In this paper, we are interested in universal enumerative source codes that possess both qualities: they provide a separate description of the type class of the encoded sequence, and this description can be efficiently computed. By "efficient computation" we mean one with running time that is polynomial in the length of the input sequence, as well as in the number of model parameters.
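    The abstract's central notions — type classes and the Krichevskii-Trofimov (KT) sequential probability assignment — can be illustrated in the simplest setting, a memoryless (order-0) binary source, rather than the tree sources treated in the paper. The sketch below is not the paper's construction; the function names and the restriction to the binary order-0 case are illustrative assumptions. It shows that the type class of a sequence is determined by its symbol counts, and that the KT assignment gives every sequence in a type class the same probability (hence the same code length), as the abstract states.

    ```python
    from itertools import product
    from math import comb

    def type_class_size(x):
        """Illustration only: for a memoryless binary source, two sequences
        are equiprobable under every parameter p iff they share the same
        number of ones, so the type class size is a binomial coefficient."""
        n, k = len(x), sum(x)
        return comb(n, k)

    def kt_probability(x):
        """KT sequential probability assignment for a binary sequence:
        P(x_{t+1} = 1 | x_1..x_t) = (ones_so_far + 1/2) / (t + 1).
        It depends only on symbol counts, so all sequences of a given
        type receive the same probability."""
        p, ones = 1.0, 0
        for t, b in enumerate(x):
            p_one = (ones + 0.5) / (t + 1)
            p *= p_one if b == 1 else 1.0 - p_one
            ones += b
        return p

    # Sanity check: within each type class, the KT probability is constant.
    n = 6
    for k in range(n + 1):
        probs = {round(kt_probability(x), 12)
                 for x in product([0, 1], repeat=n) if sum(x) == k}
        assert len(probs) == 1
    ```

    Enumerating a type class then amounts to indexing sequences with a fixed composition, which is computable in polynomial time in this toy case; the paper's contribution concerns doing the analogous thing efficiently for tree models.
    
    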
  • Keywords
maximum likelihood estimation; probability; sequences; source coding; tree codes; trees (mathematics); Krichevskii-Trofimov sequential probability assignment; ML probability; NML code; finite alphabet; normalized maximum likelihood; parametric model family; tree sources; universal enumerative source codes; universal probability assignment problem; Convergence; Cost function; Data compression; Encoding; Entropy; Information theory; Laboratories; Parametric statistics; Polynomials; Random sequences
  • fLanguage
    English
  • Publisher
IEEE
  • Conference_Titel
    Information Theory Workshop, 2008. ITW '08. IEEE
  • Conference_Location
    Porto
  • Print_ISBN
    978-1-4244-2269-2
  • Electronic_ISBN
    978-1-4244-2271-5
  • Type
    conf
  • DOI
    10.1109/ITW.2008.4578664
  • Filename
    4578664