Convergence and loss bounds for Bayesian sequence prediction

Author

Hutter, Marcus

Author_Institution

AI Inst. IDSIA, Manno-Lugano, Switzerland

Volume

49

Issue

8

fYear

2003

Firstpage

2061

Lastpage

2067

Abstract

The probability of observing x_t at time t, given past observations x₁...x_{t-1}, can be computed if the true generating distribution μ of the sequences x₁x₂x₃... is known. If μ is unknown but known to belong to a class ℳ, one can base one's prediction on the Bayes mix ξ, defined as a weighted sum of distributions ν ∈ ℳ. Various convergence results of the mixture posterior ξ_t to the true posterior μ_t are presented. In particular, a new (elementary) derivation of the convergence ξ_t/μ_t → 1 is provided, which additionally gives the rate of convergence. A general sequence predictor is allowed to choose an action y_t based on x₁...x_{t-1} and receives loss ℓ_{x_t y_t} if x_t is the next symbol of the sequence. No assumptions are made on the structure of ℓ (apart from it being bounded) or of ℳ. The Bayes-optimal prediction scheme Λ_ξ based on the mixture ξ and the Bayes-optimal informed prediction scheme Λ_μ are defined, and the total loss L_ξ of Λ_ξ is bounded in terms of the total loss L_μ of Λ_μ. It is shown that L_ξ is bounded for bounded L_μ and that L_ξ/L_μ → 1 for L_μ → ∞. Convergence of the instantaneous losses is also proven.
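
As a rough illustration of the Bayes mix described in the abstract, the sketch below tracks the mixture prediction ξ_t for a binary sequence. It is not taken from the paper: the finite class of Bernoulli models, the uniform prior weights, and the names `bayes_mixture_run`, `thetas`, and `weights` are assumptions chosen for the example.

```python
def bayes_mixture_run(thetas, weights, sequence):
    """Sequentially predict a binary sequence with a Bayes mixture.

    thetas   -- hypothetical model class: each nu is Bernoulli(theta)
    weights  -- prior weights w_nu, assumed positive and summing to 1
    sequence -- observed symbols, each 0 or 1

    Returns (preds, post): preds[t] is the mixture probability that the
    next symbol is 1 given the symbols before it, and post is the final
    posterior weight vector over the class.
    """
    post = list(weights)
    preds = []
    for x in sequence:
        # Mixture prediction: posterior-weighted sum of model predictions.
        preds.append(sum(w * th for w, th in zip(post, thetas)))
        # Bayes update: reweight each model by the likelihood of x.
        like = [th if x == 1 else 1.0 - th for th in thetas]
        norm = sum(w * l for w, l in zip(post, like))
        post = [w * l / norm for w, l in zip(post, like)]
    return preds, post


if __name__ == "__main__":
    # Deterministic stand-in for data with frequency 0.8 of ones.
    data = [1, 1, 1, 1, 0] * 100
    preds, post = bayes_mixture_run([0.2, 0.5, 0.8], [1 / 3] * 3, data)
    print("first prediction:", preds[0])
    print("last prediction:", preds[-1])
    print("posterior weights:", post)
```

On this toy input the posterior weight concentrates on the model closest to the data frequency, so the mixture prediction approaches that model's prediction. This only mirrors, for a three-element class, the kind of convergence the abstract states in general.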

Keywords

Bayes methods; information theory; probability; random sequences; Bayes mix; Bayes-optimal prediction scheme; Bayesian sequence prediction; instantaneous losses; loss bounds; rate of convergence; total loss; true generating distribution; weighted sum; Artificial intelligence; Bayesian methods; Convergence; Distributed computing; Inference algorithms; Machine learning; Prediction algorithms; Probability distribution; Source coding

fLanguage

English

Journal_Title

Information Theory, IEEE Transactions on

Publisher

IEEE

ISSN

0018-9448

Type

jour

DOI

10.1109/TIT.2003.814488

Filename

1214087