DocumentCode :
2456293
Title :
Temporal Analytics on Big Data for Web Advertising
Author :
Chandramouli, Badrish ; Goldstein, Jonathan ; Duan, Songyun
Author_Institution :
Microsoft Res., Redmond, WA, USA
fYear :
2012
fDate :
1-5 April 2012
Firstpage :
90
Lastpage :
101
Abstract :
"Big Data" in map-reduce (M-R) clusters is often fundamentally temporal in nature, as are many analytics tasks over such data. For instance, display advertising uses Behavioral Targeting (BT) to select ads for users based on prior searches, page views, etc. Previous work on BT has focused on techniques that scale well for offline data using M-R. However, this approach has limitations for BT-style applications that deal with temporal data: (1) many queries are temporal and not easily expressible in M-R, and moreover, the set-oriented nature of M-R front-ends such as SCOPE is not suitable for temporal processing; (2) as commercial systems mature, they may also need to directly analyze and react to real-time data feeds, since a high turnaround time can result in missed opportunities, but it is difficult for current solutions to also operate naturally over real-time streams. Our contributions are twofold. First, we propose a novel framework called TiMR (pronounced timer) that combines a time-oriented data processing system with an M-R framework. Users write and submit analysis algorithms as temporal queries - these queries are succinct, scale-out-agnostic, and easy to write. They scale well on large-scale offline data using TiMR, and can work unmodified over real-time streams. We also propose new cost-based query fragmentation and temporal partitioning schemes for improving efficiency with TiMR. Second, we show the feasibility of this approach for BT, with new temporal algorithms that exploit new targeting opportunities. Experiments using real data from a commercial ad platform show that TiMR is very efficient and incurs orders-of-magnitude lower development effort. Our BT solution is easy and succinct, and performs up to several times better than current schemes in terms of memory, learning time, and click-through-rate/coverage.
Keywords :
Internet; advertising; data analysis; query processing; BT-style applications; M-R front-ends; SCOPE; TiMR; Web advertising; analytics tasks; behavioral targeting; big data; click-through-rate; cost-based query fragmentation; display advertising; large-scale offline data; map-reduce clusters; page views; prior searches; real-time data feeds; temporal analytics; temporal partitioning schemes; temporal queries; Advertising; Distributed databases; Feeds; Information management; Monitoring; Real time systems; Semantics;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Data Engineering (ICDE), 2012 IEEE 28th International Conference on
Conference_Location :
Washington, DC
ISSN :
1063-6382
Print_ISBN :
978-1-4673-0042-1
Type :
conf
DOI :
10.1109/ICDE.2012.55
Filename :
6228075