  • DocumentCode
    3139736
  • Title
    TPC-H benchmarking of Pig Latin on a Hadoop cluster
  • Author
    Moussa, Rim
  • Author_Institution
    Lab. ITC&E Eng., Univ. of Tunis, Tunis, Tunisia
  • fYear
    2012
  • fDate
    26-28 June 2012
  • Firstpage
    85
  • Lastpage
    90
  • Abstract
    Several companies report success stories after migrating from relational database management systems to NoSQL (Not only SQL) systems, which seem to be taking over in most data storage fields. To deliver real benefits, these technologies must be used properly, and businesses must be aware of the limitations of NoSQL. Pig Latin is a high-level language for expressing data analysis programs, which are executed as MapReduce jobs on top of the Hadoop Distributed File System. This paper benchmarks Pig Latin using the well-known TPC-H benchmark, a decision support system benchmark, and reports performance results for different settings on GRID5000 clusters.
  • Keywords
    SQL; data analysis; decision support systems; distributed processing; high level languages; pattern clustering; relational databases; GRID5000 clusters; Hadoop cluster; Hadoop distributed file system; MapReduce framework; NoSQL systems; Pig Latin; TPC-H benchmarking; data analysis programs; data storage fields; decision support system benchmark; high-level language; relational database management systems; Benchmark testing; Business; Engines; High level languages; Parallel processing; Relational databases; Time factors; Hadoop; MapReduce; Pig Latin; TPC-H; analytics; benchmark; cloud;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Communications and Information Technology (ICCIT), 2012 International Conference on
  • Conference_Location
    Hammamet
  • Print_ISBN
    978-1-4673-1949-2
  • Type
    conf
  • DOI
    10.1109/ICCITechnol.2012.6285848
  • Filename
    6285848