Author :
Andoni, Alexandr ; Krauthgamer, Robert ; Onak, Krzysztof
Author_Institution :
Microsoft Res., Mountain View, CA, USA
Abstract :
A technique introduced by Indyk and Woodruff (STOC 2005) has inspired several recent advances in data-stream algorithms. We show that a number of these results follow easily from the application of a single probabilistic method called Precision Sampling. Using this method, we obtain simple data-stream algorithms that maintain a randomized sketch of an input vector x = (x_1, x_2, ..., x_n), which is useful for the following applications: 1) Estimating the F_k-moment of x, for k > 2. 2) Estimating the ℓ_p-norm of x, for p ∈ [1, 2], with small update time. 3) Estimating cascaded norms ℓ_p(ℓ_q) for all p, q > 0. 4) ℓ_1 sampling, where the goal is to produce an element i with probability (approximately) |x_i|/||x||_1. It extends to similarly defined ℓ_p-sampling, for p ∈ [1, 2]. For all these applications the algorithm is essentially the same: scale the vector x entry-wise by a well-chosen random vector, and run a heavy-hitter estimation algorithm on the resulting vector. Our sketch is a linear function of x, thereby allowing general updates to the vector x. Precision Sampling itself addresses the problem of estimating a sum Σ_{i=1}^n a_i from weak estimates of each real a_i ∈ [0, 1]. More precisely, the estimator first chooses a desired precision u_i ∈ (0, 1] for each i ∈ [n], and then it receives an estimate of every a_i within additive u_i. Its goal is to provide a good approximation to Σ a_i while keeping a tab on the "approximation cost" Σ_i (1/u_i). Here we refine previous work (Andoni, Krauthgamer, and Onak, FOCS 2010) which shows that as long as Σ a_i = Ω(1), a good multiplicative approximation can be achieved using total precision of only O(n log n).
Keywords :
computational complexity; function approximation; probability; sampling methods; cascaded norm estimation; data-stream algorithms; heavy-hitter estimation algorithm; linear function; multiplicative approximation; precision sampling; probabilistic method; Algorithm design and analysis; Approximation algorithms; Approximation methods; Estimation; Random variables; Reactive power; Vectors; cascaded norms; moments; sampling; streaming
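
The Precision Sampling setup described in the abstract can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the authors' estimator: it draws each precision u_i uniformly from (0, 1], simulates a weak estimate of a_i with additive error at most u_i, and counts how often the scaled estimate crosses a threshold t. The idea it relies on is that for a_i in [0, 1] and t >= 1, the event a_i/u_i >= t has probability a_i/t, so t times the count approximates the sum; the additive error shifts the scaled value by at most 1, which keeps the expectation within a factor between t/(t+1) and t/(t-1) of the true sum. The function name, the threshold t = 4, the noise model, and the averaging over independent repetitions are all hypothetical choices made for illustration.

```python
import random


def precision_sampling_estimate(a, t=4.0, reps=2000, seed=0):
    """Toy threshold estimator for sum(a), with every a_i in [0, 1].

    Assumptions (illustrative only, not taken from the paper):
    - each precision u_i is uniform on (0, 1];
    - the weak estimate is a_i plus noise uniform on [-u_i, u_i];
    - the result is averaged over `reps` independent repetitions.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        count = 0
        for a_i in a:
            u_i = 1.0 - rng.random()              # uniform on (0, 1]
            a_hat = a_i + rng.uniform(-u_i, u_i)  # estimate within additive u_i
            if a_hat / u_i >= t:                  # scaled estimate crosses threshold
                count += 1
        total += t * count                        # one repetition's estimate
    return total / reps


if __name__ == "__main__":
    values = [0.5] * 40  # true sum is 20
    print(precision_sampling_estimate(values))
```

With a larger threshold t the bias caused by the additive errors shrinks, at the price of higher variance per repetition. The simulation does not track the approximation cost Σ_i (1/u_i) that the abstract bounds.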