• DocumentCode
    615424
  • Title
    Evaluating data storage structures of MapReduce
  • Author
    Haiming Lai ; Ming Xu ; Jian Xu ; Yizhi Ren ; Ning Zheng
  • Author_Institution
    Coll. of Comput., Hangzhou Dianzi Univ., Hangzhou, China
  • fYear
    2013
  • fDate
    26-28 April 2013
  • Firstpage
    1041
  • Lastpage
    1045
  • Abstract
    The MapReduce framework and its open-source implementation Hadoop provide a scalable, fault-tolerant infrastructure for big data analysis on large clusters, and their performance varies with the data storage structure used. This paper evaluates the performance of three data storage structures for MapReduce: row-store, column-store, and RCFile. The experiments compare the three structures in terms of data loading time, data storage space, and query execution time. The experimental results show that the RCFile data storage structure achieves better performance in most cases.
  • Keywords
    data analysis; fault tolerant computing; public domain software; storage management; MapReduce framework; RCFile; big data analysis; column-store; data loading time; data storage space; data storage structure evaluation; fault-tolerant infrastructure; large clusters; open-source implementation Hadoop; query execution time; row-store; Open source software; Weaving; MapReduce; RCFile; column-store; data storage structure; row-store
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computer Science & Education (ICCSE), 2013 8th International Conference on
  • Conference_Location
    Colombo
  • Print_ISBN
    978-1-4673-4464-7
  • Type
    conf
  • DOI
    10.1109/ICCSE.2013.6554067
  • Filename
    6554067