• DocumentCode
    3029136
  • Title
    Low-storage online estimators for quantiles and densities
  • Author
    Ghosh, Sudip ; Pasupathy, Raghu
  • Author_Institution
    Bus. Analytics & Math Sci, T.J. Watson IBM Res. Center, Yorktown Heights, NY, USA
  • fYear
    2013
  • fDate
    8-11 Dec. 2013
  • Firstpage
    778
  • Lastpage
    789
  • Abstract
    The traditional estimator ξ_{p,n} for the p-quantile ξ_p of a random variable X, given n observations from the distribution of X, is obtained by inverting the empirical cumulative distribution function (cdf) constructed from those observations. The estimator ξ_{p,n} requires O(n) storage, and it is well known that the mean squared error of ξ_{p,n} (with respect to ξ_p) decays as O(n^{-1}). In this article, we present an alternative to ξ_{p,n} that seems to require dramatically less storage with negligible loss in convergence rate. The proposed estimator, ξ_{p,n}, relies on an alternative cdf that is constructed by accumulating the observed random variates into variable-sized bins that become progressively finer around the quantile. The sizes of the bins are strategically adjusted to ensure that the increased bias due to binning does not adversely affect the resulting convergence rate. We present an "online" version of the estimator ξ_{p,n}, along with a discussion of results on its consistency, convergence rates, and storage requirements. We also discuss analogous ideas for density estimation. We limit ourselves to heuristic arguments in support of the theoretical assertions we make, reserving more detailed proofs for a forthcoming paper.
  • Keywords
    computational complexity; convergence; estimation theory; mathematics computing; mean square error methods; random processes; statistical analysis; statistical distributions; alternative CDF; convergence rate; cumulative distribution function; data consistency; density estimation; heuristic arguments; low storage online estimation; mean squared error method; online quantile estimation; random variable; storage requirements; variable sized bins; Computational complexity; Context; Convergence; Distribution functions; Estimation; Monte Carlo methods; Random variables;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Simulation Conference (WSC), 2013 Winter
  • Conference_Location
    Washington, DC
  • Print_ISBN
    978-1-4799-2077-8
  • Type
    conf
  • DOI
    10.1109/WSC.2013.6721470
  • Filename
    6721470