DocumentCode
168787
Title
Scalable Infrastructures for Data in Motion
Author
Ediger, David ; McColl, R. ; Poovey, Jason ; Campbell, Daniel
Author_Institution
Georgia Tech Res. Inst., Atlanta, GA, USA
fYear
2014
fDate
26-29 May 2014
Firstpage
875
Lastpage
882
Abstract
Analytics applications for reporting and human interaction with big data rely upon scalable frameworks for data ingest, storage, and computation. Batch processing of analytic workloads increases latency of results and can perform redundant computation. In real-world applications, new data points are continuously arriving and a suite of algorithms must be updated to reflect the changes. Reducing the latency of re-computation by keeping algorithms online and up-to-date enables fast query, experimentation, and drill-down. In this paper, we share our experiences designing and implementing scalable infrastructure around NoSQL databases for social media analytics applications. We propose a new heterogeneous architecture and execution model for streaming data applications that focuses on throughput and modularity.
Keywords
Big Data; SQL; data analysis; social networking (online); NoSQL databases; analytic workloads; batch processing; big data; data in motion; data ingest; data storage; execution model; heterogeneous architecture; recomputation latency reduction; redundant computation; scalable infrastructures; social media analytics applications; streaming data applications; Algorithm design and analysis; Clustering algorithms; Computational modeling; Data structures; Databases; Media; Servers;
fLanguage
English
Publisher
IEEE
Conference_Titel
Cluster, Cloud and Grid Computing (CCGrid), 2014 14th IEEE/ACM International Symposium on
Conference_Location
Chicago, IL
Type
conf
DOI
10.1109/CCGrid.2014.91
Filename
6846541