• DocumentCode
    2456293
  • Title
    Temporal Analytics on Big Data for Web Advertising
  • Author
    Chandramouli, Badrish ; Goldstein, Jonathan ; Duan, Songyun
  • Author_Institution
    Microsoft Res., Redmond, WA, USA
  • fYear
    2012
  • fDate
    1-5 April 2012
  • Firstpage
    90
  • Lastpage
    101
  • Abstract
    "Big Data" in map-reduce (M-R) clusters is often fundamentally temporal in nature, as are many analytics tasks over such data. For instance, display advertising uses Behavioral Targeting (BT) to select ads for users based on prior searches, page views, etc. Previous work on BT has focused on techniques that scale well for offline data using M-R. However, this approach has limitations for BT-style applications that deal with temporal data: (1) many queries are temporal and not easily expressible in M-R, and moreover, the set-oriented nature of M-R front-ends such as SCOPE is not suitable for temporal processing, (2) as commercial systems mature, they may need to also directly analyze and react to real-time data feeds since a high turnaround time can result in missed opportunities, but it is difficult for current solutions to naturally also operate over real-time streams. Our contributions are twofold. First, we propose a novel framework called TiMR (pronounced timer), that combines a time-oriented data processing system with a M-R framework. Users write and submit analysis algorithms as temporal queries - these queries are succinct, scale-out-agnostic, and easy to write. They scale well on large-scale offline data using TiMR, and can work unmodified over real-time streams. We also propose new cost-based query fragmentation and temporal partitioning schemes for improving efficiency with TiMR. Second, we show the feasibility of this approach for BT, with new temporal algorithms that exploit new targeting opportunities. Experiments using real data from a commercial ad platform show that TiMR is very efficient and incurs orders-of-magnitude lower development effort. Our BT solution is easy and succinct, and performs up to several times better than current schemes in terms of memory, learning time, and click-through-rate/coverage.
  • Keywords
    Internet; advertising; data analysis; query processing; BT-style applications; M-R front-ends; SCOPE; TiMR; Web advertising; analytics tasks; behavioral targeting; big data; click-through-rate; cost-based query fragmentation; display advertising; large-scale offline data; map-reduce clusters; page views; prior searches; real-time data feeds; temporal analytics; temporal partitioning schemes; temporal queries; Advertising; Distributed databases; Feeds; Information management; Monitoring; Real time systems; Semantics;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Data Engineering (ICDE), 2012 IEEE 28th International Conference on
  • Conference_Location
    Washington, DC
  • ISSN
    1063-6382
  • Print_ISBN
    978-1-4673-0042-1
  • Type
    conf
  • DOI
    10.1109/ICDE.2012.55
  • Filename
    6228075