Title :
Straggler Identification in Round-Trip Data Streams via Newton's Identities and Invertible Bloom Filters
Author :
Eppstein, David ; Goodrich, Michael T.
Author_Institution :
Dept. of Comput. Sci., University of California, Irvine, CA, USA
Abstract :
In this paper, we study the straggler identification problem, in which an algorithm must determine the identities of the remaining members of a set after it has had a large number of insertion and deletion operations performed on it, and now has relatively few remaining members. The goal is to do this in o(n) space, where n is the total number of identities. Straggler identification has applications, for example, in determining the unacknowledged packets in a high-bandwidth multicast data stream. We provide a deterministic solution to the straggler identification problem that uses only O(d log n) bits, based on a novel application of Newton's identities for symmetric polynomials. This solution can identify any subset of d stragglers from a set of n O(log n)-bit identifiers, assuming that there are no false deletions of identities not already in the set. Indeed, we give a lower bound argument that shows that any small-space deterministic solution to the straggler identification problem cannot be guaranteed to handle false deletions. Nevertheless, we provide a simple randomized solution, using O(d log n log (1/ε)) bits, that can maintain a multiset and solve the straggler identification problem, tolerating false deletions, where ε > 0 is a user-defined parameter bounding the probability of an incorrect response. This randomized solution is based on a new type of Bloom filter, which we call the invertible Bloom filter.
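The deterministic idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class name `StragglerSketch`, the prime `P`, and the brute-force scan over a candidate `universe` are all assumptions made for illustration. The sketch keeps a count and d power sums of the current set modulo a prime (O(d log n) bits when identifiers fit in O(log n) bits), converts the power sums into elementary symmetric polynomials with Newton's identities, and reports the identifiers that are roots of the resulting polynomial. It assumes distinct identifiers in the range [0, P), no false deletions, and at most d survivors.

```python
# Assumed modulus: the Mersenne prime 2^31 - 1, taken to exceed every identifier and d.
P = 2_147_483_647


class StragglerSketch:
    """Hypothetical helper: keeps a count and d power sums of the current set, modulo P."""

    def __init__(self, d):
        self.d = d
        self.count = 0
        self.power_sums = [0] * d  # power_sums[k-1] holds the sum of x^k mod P

    def _update(self, x, sign):
        self.count += sign
        term = 1
        for k in range(self.d):
            term = term * x % P
            self.power_sums[k] = (self.power_sums[k] + sign * term) % P

    def insert(self, x):
        self._update(x, +1)

    def delete(self, x):
        # Assumes x is currently in the set (no false deletions).
        self._update(x, -1)

    def stragglers(self, universe):
        """Return the remaining identifiers, assuming at most d of them remain."""
        m = self.count
        if not 0 <= m <= self.d:
            raise ValueError("count is outside the range 0..d")
        # Newton's identities: k * e_k = sum_{i=1..k} (-1)^(i-1) * e_{k-i} * p_i
        e = [1]
        for k in range(1, m + 1):
            total = 0
            for i in range(1, k + 1):
                sign = 1 if i % 2 == 1 else -1
                total += sign * e[k - i] * self.power_sums[i - 1]
            e.append(total % P * pow(k, -1, P) % P)  # divide by k modulo P

        # The survivors are the roots of x^m - e_1 x^(m-1) + ... + (-1)^m e_m.
        def poly(x):
            value = 0
            for k in range(m + 1):  # Horner evaluation, highest degree first
                coeff = e[k] if k % 2 == 0 else -e[k]
                value = (value * x + coeff) % P
            return value

        # Simplification for this sketch: test every candidate identifier.
        return [x for x in universe if poly(x) == 0]


sketch = StragglerSketch(d=3)
for x in range(1, 1001):
    sketch.insert(x)
for x in range(1, 1001):
    if x not in (17, 256, 999):
        sketch.delete(x)
print(sketch.stragglers(range(1, 1001)))
```

The modular inverse `pow(k, -1, P)` needs Python 3.8 or later. Scanning the whole universe keeps the example short; only the stored state, not the recovery step, is meant to reflect the small-space bound stated in the abstract.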
Keywords :
computational complexity; filtering theory; polynomials; security of data; Newton identities; false deletion tolerance; high-bandwidth multicast data stream; invertible Bloom filters; round-trip data streams; straggler identification problem; symmetric polynomials; Algorithm design and analysis; Complexity theory; Finite element methods; Newton method; Polynomials; Servers; Bloom filters; Newton's identities; Straggler identification; data streams
Journal_Title :
Knowledge and Data Engineering, IEEE Transactions on
DOI :
10.1109/TKDE.2010.132