• DocumentCode
    3678403
  • Title
    Evaluating R-Based Big Data Analytic Frameworks
  • Author
    Mei Liang; Cesar Trejo; Lavanya Muthu; Linh B. Ngo; Andre Luckow; Amy W. Apon
  • Author_Institution
    Sch. of Comput., Clemson Univ., Clemson, SC, USA
  • fYear
    2015
  • Firstpage
    508
  • Lastpage
    509
  • Abstract
    We study two approaches, rHadoop and H2O, for integrating R, a popular statistical programming environment, into the Hadoop Big Data ecosystem. Using these approaches and a vanilla MapReduce implementation to answer an analytic question on the on-time airline performance data set, we evaluate the differences in runtime performance and explain their causes in terms of the design principles of rHadoop and H2O.
  • Keywords
    "Water","Big data","Standards","Sparks","Ecosystems","Complexity theory","Parallel processing"
  • Publisher
    IEEE
  • Conference_Titel
    Cluster Computing (CLUSTER), 2015 IEEE International Conference on
  • Type
    conf
  • DOI
    10.1109/CLUSTER.2015.86
  • Filename
    7307633