Title :
Processing performance on Apache Pig, Apache Hive and MySQL cluster
Author :
Fuad, Ammar ; Erwin, Alva ; Ipung, Heru Purnomo
Author_Institution :
Inf. Technol., Swiss German Univ. Edutown BSD City, Tangerang, Indonesia
Abstract :
MySQL Cluster is a well-known clustered database used to store and manipulate data. Its drawback is that as the data grows, the time required to process it increases and additional resources may be needed. With Hadoop, Hive, and Pig, processing can be faster than with MySQL Cluster. In this paper, three data testers with the same data model run simple queries to find out at how many rows Hive or Pig becomes faster than MySQL Cluster. Using a data model taken from the GroupLens Research Project [12], the results show that Hive is the most appropriate choice for this data model in a low-cost hardware environment.
Keywords :
SQL; data handling; Hadoop; Hive; MySQL cluster; Pig; apache hive; apache pig; clustered database; data model; grouplens research project; hardware environment; processing performance; Data models; Distributed databases; Educational institutions; Hardware; Motion pictures; Servers; Sorting; Hadoop; Hive; MySQL; MySQL Cluster; Pig; Processing big data;
Conference_Title :
Information, Communication Technology and System (ICTS), 2014 International Conference on
Conference_Location :
Surabaya
Print_ISBN :
978-1-4799-6857-2
DOI :
10.1109/ICTS.2014.7010600