DocumentCode :
267085
Title :
Performance Study of Spindle, A Web Analytics Query Engine Implemented in Spark
Author :
Amos, Brandon ; Tompkins, David
Author_Institution :
Adobe Res. San Jose, San Jose, CA, USA
fYear :
2014
fDate :
15-18 Dec. 2014
Firstpage :
505
Lastpage :
510
Abstract :
This paper shares our experiences building and benchmarking Spindle as an open source Spark-based web analytics platform. Spindle's design has been motivated by real-world queries and data requiring concurrent, low latency query execution. We identify a search space of Spark tuning options and study their impact on Spark's performance. Results from a self-hosted six node cluster with one week of analytics data (13.1 GB) indicate tuning options such as proper partitioning can cause a 5x performance improvement.
Keywords :
public domain software; query processing; software performance evaluation; Spark tuning options; Spindle performance study; Web analytics query engine; low latency query execution; open source Spark-based Web analytics platform; real-world queries; self-hosted six node cluster; Context; Instruction sets; Libraries; Loading; Production; Sparks; Tuning; data processing; distributed systems; performance study; web analytics;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Cloud Computing Technology and Science (CloudCom), 2014 IEEE 6th International Conference on
Conference_Location :
Singapore
Type :
conf
DOI :
10.1109/CloudCom.2014.111
Filename :
7037709