Title :
Tight lower bounds for the distinct elements problem
Author :
Indyk, Piotr ; Woodruff, David
Author_Institution :
MIT, Cambridge, MA, USA
Abstract :
We prove strong lower bounds for the space complexity of (ε, δ)-approximating the number of distinct elements F_0 in a data stream. Let m be the size of the universe from which the stream elements are drawn. We show that any one-pass streaming algorithm for (ε, δ)-approximating F_0 must use Ω(1/ε^2) space when ε = Ω(m^{-1/(9+k)}), for any k > 0, improving upon the known lower bound of Ω(1/ε) for this range of ε. This lower bound is tight up to a factor of log log m for small ε and log(1/ε) for large ε. Our lower bound is derived from a reduction from the one-way communication complexity of approximating a Boolean function in Euclidean space. The reduction makes use of a low-distortion embedding from an ℓ_2 to ℓ_1 norm.
Keywords :
Boolean functions; communication complexity; Boolean function; Euclidean space; data stream element; distinct element problem; one-pass streaming algorithm; one-way communication complexity; space complexity; tight lower bound; Algorithm design and analysis; Approximation algorithms; Complexity theory; Computer crime; Databases; IP networks
Conference_Title :
Foundations of Computer Science, 2003. Proceedings. 44th Annual IEEE Symposium on
Print_ISBN :
0-7695-2040-5
DOI :
10.1109/SFCS.2003.1238202