Abstract:
We analyze the problem of discrete distribution estimation under ℓ1 loss. We provide non-asymptotic upper and lower bounds on the maximum risk of the empirical distribution (the maximum likelihood estimator) and on the minimax risk, in regimes where the alphabet size S may grow with the number of observations n. We show that, among distributions with entropy bounded by H, the asymptotic maximum risk of the empirical distribution is 2H / ln n, while the asymptotic minimax risk is H / ln n. Moreover, a hard-thresholding estimator, whose threshold does not depend on the unknown upper bound H, is asymptotically minimax. We draw connections between our work and the literature on density estimation, entropy estimation, total variation distance (ℓ1 divergence) estimation, joint distribution estimation in stochastic processes, normal mean estimation, and adaptive estimation.
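In symbols, the two asymptotic claims above can be sketched as follows; the notation for the entropy-bounded class and the empirical distribution is our own shorthand rather than fixed by the abstract:
\[
\sup_{P:\, H(P)\le H} \mathbb{E}_P \big\|\hat{P}_n - P\big\|_1 \;\sim\; \frac{2H}{\ln n},
\qquad
\inf_{\hat{P}} \, \sup_{P:\, H(P)\le H} \mathbb{E}_P \big\|\hat{P} - P\big\|_1 \;\sim\; \frac{H}{\ln n},
\]
where \(\hat{P}_n\) denotes the empirical distribution of the n observations, the infimum is over all estimators, and \(\|\cdot\|_1\) is the ℓ1 (total variation up to a factor of 2) distance on the S-element alphabet.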