Let $\{X_i\}_{i=1}^{\infty}$ be a sequence of independent Bernoulli random variables with probability $p$ that $X_i = 1$ and probability $q = 1 - p$ that $X_i = 0$ for all $i \ge 1$. Time-invariant finite-memory (i.e., finite-state) estimation procedures for the parameter $p$ are considered which take $X_1, X_2, \ldots$ as an input sequence. In particular, an $n$-state deterministic estimation procedure is described which can estimate $p$ with mean-square error $O(\log n / n)$ and an $n$-state probabilistic estimation procedure which can estimate $p$ with mean-square error $O(1/n)$. It is proved that the $O(1/n)$ bound is optimal to within a constant factor. In addition, it is shown that linear estimation procedures are just as powerful (up to the measure of mean-square error) as arbitrary estimation procedures. The proofs are based on an analog of the well-known matrix tree theorem that is called the Markov chain tree theorem.
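To give a feel for how a finite-state procedure with randomized transitions can reach a mean-square error on the order of 1/n, the following is a minimal sketch under stated assumptions, not necessarily the procedure analyzed in the paper. It assumes an Ehrenfest-style birth-death chain on states 0, ..., n-1 in which state s reports the estimate s/(n-1). The names `step`, `estimate`, and `stationary_mse` are hypothetical and chosen only for illustration. For 0 < p < 1, detailed balance shows that the stationary distribution of this particular chain is Binomial(n-1, p), so its stationary mean-square error is p(1-p)/(n-1).

```python
import random


def estimate(state, n):
    """Estimate of p reported in a given state (assumed linear in the state)."""
    return state / (n - 1)


def step(state, x, n, rng):
    """One transition of the assumed n-state chain on input bit x.

    On x == 1 the state moves up with probability (n-1-state)/(n-1);
    on x == 0 it moves down with probability state/(n-1);
    otherwise it stays where it is.
    """
    if x == 1:
        if rng.random() < (n - 1 - state) / (n - 1):
            return state + 1
    else:
        if rng.random() < state / (n - 1):
            return state - 1
    return state


def stationary_mse(p, n):
    """Exact stationary mean-square error of the chain, for 0 < p < 1.

    The stationary weights come from detailed balance for a birth-death chain:
    weight[i+1] / weight[i] = P(i -> i+1) / P(i+1 -> i).
    """
    q = 1.0 - p
    weights = [1.0]
    for i in range(n - 1):
        up = p * (n - 1 - i) / (n - 1)      # probability of moving i -> i+1
        down = q * (i + 1) / (n - 1)        # probability of moving i+1 -> i
        weights.append(weights[-1] * up / down)
    total = sum(weights)
    return sum(w / total * (estimate(i, n) - p) ** 2
               for i, w in enumerate(weights))


if __name__ == "__main__":
    n, p = 11, 0.3
    rng = random.Random(0)
    state = 0
    for _ in range(10_000):
        x = 1 if rng.random() < p else 0    # draw one Bernoulli(p) input
        state = step(state, x, n, rng)
    print("estimate after run:", estimate(state, n))
    print("stationary MSE:", stationary_mse(p, n), "vs p(1-p)/(n-1):", p * (1 - p) / (n - 1))
```

The chain uses only the current state and the current input bit, so it is time-invariant and finite-memory in the sense above. The randomness in the transitions is what makes it a probabilistic rather than a deterministic procedure.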