DocumentCode
3357285
Title
Finding correlated heavy-hitters over data streams
Author
Lahiri, Bibudh ; Tirthapura, Srikanta
Author_Institution
Dept. of Electr. & Comput. Eng., Iowa State Univ., Ames, IA, USA
fYear
2009
fDate
14-16 Dec. 2009
Firstpage
307
Lastpage
314
Abstract
We consider online mining of correlated heavy-hitters (CHH) from network data streams. Given a multidimensional dataset, a correlated aggregate query first filters a subset by applying a predicate along a primary dimension, and then computes aggregates along a secondary dimension of that subset data. We consider queries of the following form: "In a stream of (x, y) tuples, on the subset H of all x values that are heavy-hitters, maintain those y values that occur frequently with the x values in H". This query arises naturally in situations where we need to track not only the identity of frequently occurring elements in a stream, but also additional information associated with these elements along other dimensions. Prior work on tracking heavy-hitters has focused only on tracking the identity and frequency of heavy-hitters on a single dimensional stream, and yield little information about correlated heavy-hitters. Our online data stream algorithm is easy to implement and uses workspace which is orders of magnitude smaller than the stream itself. We present provable guarantees on the maximum error estimates, as well as experimental results, that demonstrate the space-accuracy trade-off on a large data stream of packet headers from a backbone network link.
Keywords
data mining; query processing; backbone network link; correlated aggregate query; correlated heavy-hitters; maximum error estimates; network data streams; online mining; packet headers; space-accuracy trade-off; Aggregates; Data engineering; Databases; Filters; Frequency; Multidimensional systems; Network servers; Spine; Telecommunication traffic; Data streams; correlation; heavy-hitters; sketches;
fLanguage
English
Publisher
ieee
Conference_Titel
Performance Computing and Communications Conference (IPCCC), 2009 IEEE 28th International
Conference_Location
Scottsdale, AZ
ISSN
1097-2641
Print_ISBN
978-1-4244-5737-3
Type
conf
DOI
10.1109/PCCC.2009.5403820
Filename
5403820
Link To Document