DocumentCode
3139736
Title
TPC-H benchmarking of Pig Latin on a Hadoop cluster
Author
Moussa, Rim
Author_Institution
Lab. ITC&E Eng., Univ. of Tunis, Tunis, Tunisia
fYear
2012
fDate
26-28 June 2012
Firstpage
85
Lastpage
90
Abstract
Several companies report success stories after migrating from relational database management systems to NoSQL (Not only SQL) systems. The latter seem to be taking over in most data storage fields. To deliver real benefits, these technologies must be used properly, and businesses must be aware of the limitations of NoSQL. Pig Latin is a high-level language for expressing data analysis programs, which are compiled into MapReduce jobs that run on top of the Hadoop Distributed File System. This paper benchmarks Pig Latin using the well-known TPC-H benchmark, a Decision Support System benchmark, and reports performance results for different settings on GRID5000 clusters.
Keywords
SQL; data analysis; decision support systems; distributed processing; high level languages; pattern clustering; relational databases; GRID5000 clusters; Hadoop cluster; Hadoop distributed file system; MapReduce framework; NoSQL systems; Pig Latin; TPC-H benchmarking; data analysis programs; data storage fields; decision support system benchmark; high-level language; relational database management systems; Benchmark testing; Business; Engines; High level languages; Parallel processing; Relational databases; Time factors; Hadoop; MapReduce; Pig Latin; TPC-H; analytics; benchmark; cloud;
fLanguage
English
Publisher
IEEE
Conference_Titel
Communications and Information Technology (ICCIT), 2012 International Conference on
Conference_Location
Hammamet
Print_ISBN
978-1-4673-1949-2
Type
conf
DOI
10.1109/ICCITechnol.2012.6285848
Filename
6285848