• DocumentCode
    168787
  • Title

    Scalable Infrastructures for Data in Motion

  • Author

    Ediger, David ; McColl, R. ; Poovey, Jason ; Campbell, Daniel

  • Author_Institution
    Georgia Tech Res. Inst., Atlanta, GA, USA
  • fYear
    2014
  • fDate
    26-29 May 2014
  • Firstpage
    875
  • Lastpage
    882
  • Abstract
    Analytics applications for reporting and human interaction with big data rely upon scalable frameworks for data ingest, storage, and computation. Batch processing of analytic workloads increases latency of results and can perform redundant computation. In real-world applications, new data points are continuously arriving and a suite of algorithms must be updated to reflect the changes. Reducing the latency of re-computation by keeping algorithms online and up-to-date enables fast query, experimentation, and drill-down. In this paper, we share our experiences designing and implementing scalable infrastructure around NoSQL databases for social media analytics applications. We propose a new heterogeneous architecture and execution model for streaming data applications that focuses on throughput and modularity.
  • Keywords
    Big Data; SQL; data analysis; social networking (online); NoSQL databases; analytic workloads; batch processing; big data; data in motion; data ingest; data storage; execution model; heterogeneous architecture; recomputation latency reduction; redundant computation; scalable infrastructures; social media analytics applications; streaming data applications; Algorithm design and analysis; Clustering algorithms; Computational modeling; Data structures; Databases; Media; Servers
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Cluster, Cloud and Grid Computing (CCGrid), 2014 14th IEEE/ACM International Symposium on
  • Conference_Location
    Chicago, IL
  • Type

    conf

  • DOI
    10.1109/CCGrid.2014.91
  • Filename
    6846541