DocumentCode
615424
Title
Evaluating data storage structures of MapReduce
Author
Haiming Lai ; Ming Xu ; Jian Xu ; Yizhi Ren ; Ning Zheng
Author_Institution
Coll. of Comput., Hangzhou Dianzi Univ., Hangzhou, China
fYear
2013
fDate
26-28 April 2013
Firstpage
1041
Lastpage
1045
Abstract
The MapReduce framework and its open-source implementation Hadoop, a scalable and fault-tolerant infrastructure for big data analysis on large clusters, perform differently depending on the data storage structure used. This paper evaluates the performance of three data storage structures for MapReduce: row-store, column-store, and RCFile. The experiments compare the three structures in terms of data loading time, data storage space, and query execution time. The experimental results show that the RCFile data storage structure achieves better performance in most cases.
Keywords
data analysis; fault tolerant computing; public domain software; storage management; MapReduce framework; RCFile; big data analysis; column-store; data loading time; data storage space; data storage structure evaluation; fault-tolerant infrastructure; large clusters; open-source implementation Hadoop; query execution time; row-store; Open source software; Weaving; MapReduce; RCFile; column-store; data storage structure; row-store
fLanguage
English
Publisher
IEEE
Conference_Titel
Computer Science & Education (ICCSE), 2013 8th International Conference on
Conference_Location
Colombo
Print_ISBN
978-1-4673-4464-7
Type
conf
DOI
10.1109/ICCSE.2013.6554067
Filename
6554067